AWS Machine Learning Blog, August 15
Bringing agentic Retrieval Augmented Generation to Amazon Q Business

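Amazon Q Business has introduced Agentic RAG, an agent-based retrieval strategy that significantly improves how enterprises interact with their data. It decomposes complex queries, supports multi-turn conversations, and intelligently invokes a range of data exploration tools (such as tabular search and long context retrieval) to provide more accurate and comprehensive answers. Agentic RAG lets users view the data retrieval process in real time, which increases transparency and trust. Through dynamic response optimization, it proactively runs additional retrievals when information is incomplete, so that answers are complete. This allows enterprises to extract more value from their data and work more efficiently.

💡 At its core, Agentic RAG introduces AI agents that dynamically plan and execute complex retrieval strategies, addressing the limitations of traditional RAG on complex enterprise queries. It breaks complex questions into manageable parts and intelligently selects and uses data exploration tools, producing more accurate and comprehensive responses.

🔍 Agentic RAG makes query processing more transparent. Through its transparent events feature, users can view query decomposition, document retrieval paths, and the response generation workflow in real time. This visibility increases user confidence in the system's decision-making and gives insight into how responses are generated.

🛠️ Agentic RAG uses tools intelligently. Based on the retrieval plan, the agent can dynamically invoke several data exploration tools, including tabular search (through code generation or tabular linearization) and long context retrieval (for queries that require full document context). This goes beyond the passage-based retrieval that traditional RAG relies on.

💬 Agentic RAG improves conversational capabilities, supporting multi-turn interaction and short-term memory so that conversations are more natural and context-aware. When multiple answers are possible, the agent asks clarifying questions to understand the user's intent and provide a more precise response.

📈 Agentic RAG performs dynamic response optimization. The agent actively evaluates and iteratively refines response quality, recognizes when the initial retrieval falls short, and autonomously initiates additional searches or adjusts its strategy. This helps it capture all relevant information on complex topics while keeping the conversation context coherent.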

Amazon Q Business is a generative AI-powered enterprise assistant that helps organizations unlock value from their data. By connecting to enterprise data sources, employees can use Amazon Q Business to quickly find answers, generate content, and automate tasks—from accessing HR policies to streamlining IT support workflows, all while respecting existing permissions and providing clear citations. At the heart of systems like Amazon Q Business lies Retrieval Augmented Generation (RAG), which enables AI models to ground their responses in an organization’s enterprise data.

The evolution of RAG

Traditional RAG implementations typically follow a straightforward approach: retrieve relevant documents or passages based on a user query, then generate a response using these documents or passages as context for the large language model (LLM) to respond. While this methodology works well for basic, factual queries, enterprise environments present uniquely complex challenges that expose the limitations of this single-shot retrieval approach.
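To make the single-shot pattern concrete, here is a minimal sketch of retrieve-then-generate. It is an illustration only, not how Amazon Q Business is implemented: the word-overlap scorer, the `single_shot_rag` function, and the prompt layout are all assumptions, and the step that would send the prompt to an LLM is left out.

```python
def score(query: str, passage: str) -> int:
    """Toy relevance score: number of lowercase words shared with the query."""
    return len(set(query.lower().split()) & set(passage.lower().split()))


def single_shot_rag(query: str, passages: list[str], top_k: int = 2) -> str:
    """Retrieve once, then build the prompt an LLM would answer from."""
    # One retrieval pass: rank all passages and keep the top_k.
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    context = ranked[:top_k]
    # No second pass: whatever was retrieved is all the model gets to see.
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = ["HR policy on remote work", "IT support ticket workflow", "Cafeteria menu"]
    print(single_shot_rag("remote work policy", docs, top_k=1))
```

Because retrieval happens exactly once, a query that spans several topics gets only whatever the single ranking happened to surface.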

Consider an employee asking about the differences between two benefits packages or requesting a comparison of project outcomes across multiple quarters. These queries require synthesizing information from various sources, understanding company-specific context, and often need multiple retrieval steps to gather comprehensive information around each aspect of the query.

Traditional RAG systems struggle with such complexity, often providing incomplete answers or failing to adapt their retrieval strategy when initial results are insufficient. When processing these more involved queries, users are left waiting without visibility into the system’s progress, leading to an opaque experience.

Bringing agency to Amazon Q Business

Agentic RAG brings a new paradigm to Amazon Q Business for handling sophisticated enterprise queries through intelligent, agent-based retrieval strategies. By introducing AI agents that dynamically plan and execute these retrieval strategies with a suite of data navigation tools, Agentic RAG represents a significant evolution in how AI assistants interact with enterprise data, delivering more accurate and comprehensive responses while maintaining the speed users expect.

With Agentic RAG in Amazon Q Business you have several new capabilities, including query decomposition and transparent events, agentic retrieval tool use, improved conversational capabilities, and agentic response optimization. Let’s dive deeper into what each of these means.

Query decomposition and transparent response events

Traditional RAG systems often face significant challenges when processing complex enterprise queries, particularly those involving multiple steps, composite elements, or comparative analysis. With this release of Agentic RAG in Amazon Q Business, we aim to solve this problem through sophisticated query decomposition techniques, where AI agents intelligently break down complex questions into discrete, manageable components.

When an employee asks "Please compare the vacation policies of Washington and California?", the question is decomposed into two queries, one for each state. The first decomposed query is "washington state vacation policies" and the second is "california state vacation policies".
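The decomposition in this example can be sketched in a few lines. This is a toy under stated assumptions, not the Amazon Q Business implementation: the regex-based `decompose` function, the in-memory `CORPUS`, and its placeholder texts are all hypothetical, and a production agent would use an LLM rather than a pattern match to split the query.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an enterprise index; the texts are placeholders.
CORPUS = {
    "washington": "Placeholder text of the Washington vacation policy document.",
    "california": "Placeholder text of the California vacation policy document.",
}


def decompose(query: str) -> list[str]:
    """Split 'compare the <topic> of <A> and <B>' into one sub-query per entity."""
    match = re.search(r"compare the (.+?) of (.+?) and (.+?)\??$", query, re.IGNORECASE)
    if not match:
        return [query]  # Not a comparison: keep the query whole.
    topic, first, second = match.groups()
    return [f"{entity.lower()} state {topic.lower()}" for entity in (first, second)]


def retrieve(sub_query: str) -> list[str]:
    """Toy retrieval: return documents whose key appears in the sub-query."""
    return [text for key, text in CORPUS.items() if key in sub_query]


def decompose_and_retrieve(query: str) -> dict[str, list[str]]:
    """Run each sub-query in parallel and collect the results per sub-query."""
    sub_queries = decompose(query)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(retrieve, sub_queries))
    return dict(zip(sub_queries, results))


if __name__ == "__main__":
    question = "Please compare the vacation policies of Washington and California?"
    for sub_query, docs in decompose_and_retrieve(question).items():
        print(sub_query, "->", docs)
```

Each sub-query is retrieved independently, so the context passed to the LLM covers both states rather than whichever one a single search ranked higher.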

Because Agentic RAG involves a series of parallel steps to explore the data source and collect thorough information for more accurate query resolution, we now provide real-time visibility into its processing steps, which are displayed on the screen as data is being retrieved to generate the response. After the response is generated, the steps are collapsed and the response is streamed. In the following image, we see how the decomposed queries are displayed and the relevant data retrieved for response generation.

This allows users to see meaningful updates to the system’s operations, including query decomposition patterns, document retrieval paths, and response generation workflows. This granular visibility into the system’s decision-making process enhances user confidence and provides valuable insights into the sophisticated mechanisms driving accurate response generation.

This agentic solution facilitates comprehensive data collection and enables more accurate, nuanced responses. The result is enhanced responses that maintain both granular precision and holistic understanding of complex, multi-faceted business questions, while relying on the LLM to synthesize the information retrieved. As shown in the following image, the information fetched individually for the California and Washington vacation policies was synthesized by the LLM and presented in a rich markdown format.

Agentic tool use

The RAG agents choose among data exploration tools and retrieval methods by reasoning about the retrieval plan, while maintaining context across multiple turns of the conversation. These include tools built into Amazon Q Business such as tabular search, which retrieves data through either code generation or tabular linearization across small and large tables embedded in documents (such as DOCX, PPTX, PDF, and so on) or stored in CSV or XLSX files. Another tool is long context retrieval, which determines when the full context of a document is required for retrieval. For example, if a user asks "Summarize the 10K of Company X", the agent could identify the query’s intent as a summarization query that requires document-level context and, as a result, deploy the long context retrieval tool, which fetches the complete document (the 10K of Company X) as part of the context for the LLM to generate a response (as shown in the following figure). This intelligent tool selection and deployment represents a significant advancement over traditional RAG systems, which often rely on fragmented passage retrieval that can compromise the coherence and completeness of complex document analysis for question answering.
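As a rough illustration of two ideas in this section, the sketch below shows tabular linearization (turning table rows into plain text that a retriever can index) and a keyword-based tool router. Everything here is an assumption made for illustration: the tool names, the keyword rules in `choose_tool`, and the sample CSV data are hypothetical, and the real agent presumably selects tools through LLM reasoning, not keyword matching.

```python
import csv
import io

# Hypothetical sample table; the figures are made up for the example.
CSV_TEXT = """region,quarter,revenue
Northeast,Q1,120
Southeast,Q1,95
"""


def linearize(csv_text: str) -> list[str]:
    """Tabular linearization: render each row as a 'column: value' sentence."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["; ".join(f"{col}: {val}" for col, val in row.items()) for row in reader]


def choose_tool(query: str) -> str:
    """Toy router that maps a query to a (hypothetical) retrieval tool name."""
    q = query.lower()
    if q.startswith("summarize"):
        # Summaries need the whole document, not isolated passages.
        return "long_context_retrieval"
    if any(word in q for word in ("total", "average", "table")):
        return "tabular_search"
    return "passage_retrieval"


if __name__ == "__main__":
    for line in linearize(CSV_TEXT):
        print(line)
    print(choose_tool("Summarize the 10K of Company X"))
```

Linearization suits small tables that fit in the prompt. For large tables, the code generation path described above would instead run a query against the data and return only the result.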

Improved conversational capabilities

Agentic RAG introduces multi-turn query capabilities that turn Amazon Q Business conversations into dynamic, context-aware dialogues. The agent maintains conversational context across interactions by storing short-term memory, enabling natural follow-up questions without requiring users to restate previous context. Additionally, when the agent encounters multiple possible answers based on your enterprise data, it asks clarifying questions to disambiguate the query and better understand what you’re looking for. For instance, "Q" could refer to any of the many implementations of Amazon Q. The system handles this semantic ambiguity by recognizing the multiple potential interpretations of Q and asking for clarification in its response to verify accuracy and relevance. This approach to dialogue management makes complex tasks like policy interpretation or technical troubleshooting more efficient, because the system can progressively refine its understanding through targeted clarification and follow-up exchanges.

In the following image, the user asks "tell me about Q", and the system provides a high-level overview of the various implementations and asks a follow-up question to disambiguate the user’s search intent.

Upon successful disambiguation, the system persists both the conversation state and previously retrieved contextual data in memory. This enables targeted responses that align with the user’s clarified intent and are therefore more accurate, relevant, and complete.
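The clarify-then-answer flow can be sketched as a small state machine. This is a simplified illustration, not the product's implementation: the `Conversation` class, the `respond` function, and the `KNOWN_SENSES` table are hypothetical, and a real agent would derive the candidate interpretations from retrieved enterprise data instead of a hard-coded dictionary.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sense inventory used only for this example.
KNOWN_SENSES = {"q": ["Amazon Q Business", "Amazon Q Developer"]}


@dataclass
class Conversation:
    history: list = field(default_factory=list)  # Short-term memory of user turns.
    pending_term: Optional[str] = None           # Term awaiting clarification.


def clarifying_question(term: str) -> str:
    return "Which do you mean: " + " or ".join(KNOWN_SENSES[term]) + "?"


def respond(conv: Conversation, message: str) -> str:
    """Ask a clarifying question for ambiguous terms, then answer once resolved."""
    conv.history.append(message)
    lowered = message.lower()
    if conv.pending_term is not None:
        # The previous turn asked for clarification; try to resolve it now.
        chosen = [s for s in KNOWN_SENSES[conv.pending_term] if s.lower() in lowered]
        if len(chosen) == 1:
            conv.pending_term = None
            return f"Answering about {chosen[0]}."
        return clarifying_question(conv.pending_term)
    for word in lowered.split():
        if len(KNOWN_SENSES.get(word, [])) > 1:
            conv.pending_term = word  # Remember what still needs clarifying.
            return clarifying_question(word)
    return f"Answering: {message}"


if __name__ == "__main__":
    conv = Conversation()
    print(respond(conv, "tell me about Q"))
    print(respond(conv, "Amazon Q Business"))
```

The conversation object carries state between turns, so the second message can be a bare product name and still be understood as the answer to the clarifying question.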

Agentic response optimization

Agentic RAG introduces dynamic response optimization where AI agents actively evaluate and refine their responses. Unlike traditional systems that provide answers even when the context is insufficient, these agents continuously assess response quality and iteratively plan out new actions to improve information completeness. They can recognize when initial retrievals miss crucial information and autonomously initiate additional searches or alternative retrieval strategies. This means when discussing complex topics like compliance policies, the system captures all relevant updates, exceptions, and interdependencies while maintaining context across multiple turns of the conversation. The following diagram shows how Agentic RAG handles the conversation history across multiple turns of the conversation. The agent plans and reasons across the retrieval tool use and response generation process. Based on the initial retrieval, while taking into account the conversation state and history, the agent re-plans the process as needed to generate the most complete and accurate response for the user’s query.
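The evaluate-and-replan loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual agent: the `DOCS` index, the `REQUIRED_ASPECTS` completeness check, and the `answer_with_replanning` function are hypothetical, and a real agent would judge completeness with an LLM rather than a fixed keyword list.

```python
# Hypothetical index keyed by exact query string; the texts are placeholders.
DOCS = {
    "policy": "Base compliance policy text.",
    "policy exceptions": "Exceptions to the compliance policy.",
    "policy updates": "Recent updates to the compliance policy.",
}

# Hypothetical completeness criteria the answer context must cover.
REQUIRED_ASPECTS = ("policy", "exceptions", "updates")


def retrieve(query: str):
    """Toy retrieval: exact-match lookup, returning None on a miss."""
    return DOCS.get(query)


def missing_aspects(context: list[str]) -> list[str]:
    """Evaluate the gathered context and list the aspects it does not cover."""
    joined = " ".join(context).lower()
    return [aspect for aspect in REQUIRED_ASPECTS if aspect not in joined]


def answer_with_replanning(query: str, max_rounds: int = 3):
    """Retrieve, evaluate, and re-plan until the context is complete or rounds run out."""
    context: list[str] = []
    plan = [query]
    rounds = 0
    while plan and rounds < max_rounds:
        rounds += 1
        for planned_query in plan:
            text = retrieve(planned_query)
            if text and text not in context:
                context.append(text)
        # Re-plan: issue one follow-up search per aspect still missing.
        plan = [f"{query} {aspect}" for aspect in missing_aspects(context)]
    return context, rounds


if __name__ == "__main__":
    context, rounds = answer_with_replanning("policy")
    print(f"{len(context)} documents gathered in {rounds} rounds")
```

The `max_rounds` cap is a design choice in this sketch that bounds latency, so the loop stops even when some aspect can never be found.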

Using the Agentic RAG feature

Getting started with Agentic RAG’s advanced capabilities in Amazon Q Business is straightforward and can immediately improve how your organization interacts with your enterprise data. To begin, in the Amazon Q Business web interface, you can switch on the Advanced Search toggle to enable Agentic RAG, as shown in the following image.

After advanced search is enabled, users can experience richer and more complete responses from Amazon Q Business. Agentic RAG particularly shines when handling complex business scenarios based on your enterprise data—imagine asking about cross-AWS Region performance comparisons, investigating policy implications across departments, or analyzing historical trends in project deliveries. The system excels at breaking down these complex queries into manageable search tasks while maintaining context throughout the conversation.

For the best experience, users should feel confident in asking detailed, multi-part questions. Unlike traditional search systems, Agentic RAG handles nuanced queries such as the following:

How have our metrics changed across the southeast and northeast regions in 2024?

The system will work through such questions methodically, showing its progress as it analyzes and breaks the query down into composite parts to fetch sufficient context and generate a complete and accurate response.

Conclusion

Agentic RAG represents a significant leap forward for Amazon Q Business, transforming how organizations use their enterprise data while maintaining the robust security and compliance that they expect with AWS services. Through its sophisticated query processing and contextual understanding, the system enables deeper, more nuanced interactions with enterprise data—from comparative and multi-step queries to interactive multi-turn chat experiences. All of this occurs within a secure framework that respects existing permissions and access controls, making sure that users receive only authorized information while maintaining the rich, contextual responses needed for meaningful insights.

By combining advanced retrieval capabilities with intelligent, conversation-aware interactions, Agentic RAG allows organizations to unlock the full potential of their data while maintaining the highest standards of data governance. The result is an improved chat experience and a more capable query answering engine that maximizes the value of your data assets.

Try out Amazon Q Business for your organization with your data and share your thoughts in the comments.


About the authors

Sanjit Misra is a technical product leader at Amazon Web Services, driving innovation on Amazon Q Business, Amazon’s generative AI product. He leads product development for core Agentic AI features that enhance accuracy and retrieval — including Agentic RAG, conversational disambiguation, tabular search, and long-context retrieval. With over 15 years of experience across product and engineering roles in data, analytics, and AI/ML, Sanjit combines deep technical expertise with a track record of delivering business outcomes. He is based in New York City.

Venky Nagapudi is a Senior Manager of Product Management for Amazon Q Business. His focus areas include RAG features, accuracy evaluation and enhancement, user identity management and user subscriptions.

Yi-An Lai is a Senior Applied Scientist with the Amazon Q Business team at Amazon Web Services in Seattle, WA. His expertise spans agentic information retrieval, conversational AI systems, LLM tool orchestration, and advanced natural language processing. With over a decade of experience in ML/AI, he has been enthusiastic about developing sophisticated AI solutions that bridge state-of-the-art research and practical enterprise applications.

Yumo Xu is an Applied Scientist at AWS, where he focuses on building helpful and responsible AI systems for enterprises. His primary research interests are centered on the foundational challenges of machine reasoning and agentic AI. Prior to AWS, Yumo received his PhD in Natural Language Processing from the University of Edinburgh.

Danilo Neves Ribeiro is an Applied Scientist on the Q Business team based in Santa Clara, CA. He is currently working on designing innovative solutions for information retrieval, reasoning, language model agents, and conversational experience for enterprise use cases within AWS. He holds a Ph.D. in Computer Science from Northwestern University (2023) and has over three years of experience working as an AI/ML scientist.

Kapil Badesara is a Senior Machine Learning Engineer on AWS Q Business, focusing on optimizing RAG systems for accuracy and efficiency. Kapil is based out of Seattle and has more than 10 years of experience building large scale AI/ML services.

Sunil Singh is an Engineering Manager on the Amazon Q Business team, where he leads the development of next-generation agentic AI solutions designed to enhance Retrieval-Augmented Generation (RAG) systems for greater accuracy and efficiency. Sunil is based out of Seattle and has more than 10 years of experience in architecting secure, scalable AI/ML services for enterprise-grade applications.
