AWS Machine Learning Blog · October 16, 02:40
Amazon Bedrock AgentCore: A persistent memory system for AI agents

Amazon Bedrock AgentCore introduces a powerful long-term memory system that addresses the challenge of AI agents understanding and remembering user interactions. Through a multi-stage process, the system transforms raw conversational data into structured, searchable knowledge. It distinguishes important information from casual chatter, consolidates related information across time, and processes memories in chronological order. AgentCore uses LLMs for memory extraction (covering semantic memory, user preferences, and summary memory), followed by intelligent consolidation that resolves conflicts and reduces redundancy. By marking outdated memories as invalid rather than deleting them, the system keeps information traceable. It also supports custom memory strategies and model selection to suit different application scenarios. Performance evaluations show the system strikes a good balance between compression rate and accuracy, efficiently managing large conversation histories and laying the foundation for more intelligent, personalized AI agents.

💡 **The memory challenge for AI agents and AgentCore's solution**: Building AI agents that remember user interactions involves multiple challenges, including distinguishing meaningful insights from casual chatter, consolidating related information across time, and handling an ever-growing memory store. The Amazon Bedrock AgentCore long-term memory system addresses these problems through a multi-stage pipeline that transforms raw conversational data into structured, searchable knowledge, enabling AI agents to build persistent, meaningful relationships with users.

🧠 **Multi-stage memory processing pipeline**: The AgentCore long-term memory system has two core stages: memory extraction and memory consolidation. Extraction uses large language models (LLMs) to analyze conversational content and generate memory records covering factual knowledge (semantic memory), user preferences, and conversation summaries. Consolidation then retrieves related memories, processes them intelligently, and updates the vector store, merging related information, resolving conflicts, and minimizing redundancy to keep the memory store coherent and current.

⚙️ **Flexible memory strategies and customization options**: AgentCore provides three built-in memory strategies: semantic memory (facts and knowledge), user preference (explicit and implicit preferences), and summary memory (running narratives). Developers can also override the built-in logic with custom prompts, or use self-managed strategies for full control over extraction and consolidation, adapting the system to specific applications and domains. Custom model selection is supported as well, to balance accuracy against latency.

🚀 **Performance and best practices**: Evaluated on several public benchmark datasets, the AgentCore long-term memory system achieves a strong trade-off between compression rate (up to 95%) and accuracy, significantly reducing storage overhead and inference cost. To maximize its effectiveness, choose appropriate memory strategies, design meaningful namespaces, monitor consolidation patterns, and plan for asynchronous processing, so AI agents can interact with users efficiently and personally.

Building AI agents that remember user interactions requires more than just storing raw conversations. While Amazon Bedrock AgentCore short-term memory captures immediate context, the real challenge lies in transforming these interactions into persistent, actionable knowledge that spans across sessions. This is the information that transforms fleeting interactions into meaningful, continuous relationships between users and AI agents. In this post, we’re pulling back the curtain on how the Amazon Bedrock AgentCore Memory long-term memory system works.

If you’re new to AgentCore Memory, we recommend reading our introductory blog post first: Amazon Bedrock AgentCore Memory: Building context-aware agents. In brief, AgentCore Memory is a fully managed service that enables developers to build context-aware AI agents by providing both short-term working memory and long-term intelligent memory capabilities.

The challenge of persistent memory

When humans interact, we don't just remember exact conversations: we extract meaning, identify patterns, and build understanding over time. Teaching AI agents to do the same requires solving several complex challenges, including distinguishing meaningful insights from casual chatter, consolidating related information across time, and managing an ever-growing memory store.

Solving these problems requires sophisticated extraction, consolidation, and retrieval mechanisms that go beyond simple storage. Amazon Bedrock AgentCore Memory tackles these complexities by implementing a research-backed long-term memory pipeline that mirrors human cognitive processes while maintaining the precision and scale required for enterprise applications.

How AgentCore long-term memory works

When the agentic application sends conversational events to AgentCore Memory, it initiates a pipeline to transform raw conversational data into structured, searchable knowledge through a multi-stage process. Let’s explore each component of this system. 

1. Memory extraction: From conversation to insights

When new events are stored in short-term memory, an asynchronous extraction process analyzes the conversational content to identify meaningful information. This process leverages large language models (LLMs) to understand context and extract relevant details that should be preserved in long-term memory. The extraction engine processes incoming messages alongside prior context to generate memory records in a predefined schema. As a developer, you can configure one or more memory strategies to extract only the information types relevant to your application needs. The extraction process supports three built-in memory strategies:

Semantic memory: Facts and knowledge drawn from conversations
User preference memory: Explicit and implicit preferences
Summary memory: Running narratives of the conversation

For each strategy, the system processes events with timestamps to maintain contextual continuity and support conflict resolution. Multiple memories can be extracted from a single event, and each memory strategy operates independently, allowing parallel processing.
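To make this flow concrete, here is a minimal sketch of per-strategy extraction over a single event. The record shape, strategy names, and keyword-based "extractors" are illustrative assumptions standing in for the service's LLM-based extraction, not the actual AgentCore schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical record shape; the actual AgentCore schema differs.
@dataclass
class MemoryRecord:
    strategy: str
    content: str
    timestamp: str

def extract_semantic(messages):
    # Stand-in for the LLM call that pulls factual statements out of a turn.
    return [m["text"] for m in messages if "is" in m["text"]]

def extract_preferences(messages):
    # Stand-in for preference extraction.
    return [m["text"] for m in messages if "prefer" in m["text"].lower()]

STRATEGIES = {
    "SEMANTIC": extract_semantic,
    "USER_PREFERENCE": extract_preferences,
}

def extract_event(messages):
    """Run every configured strategy independently over one event.
    Each strategy can emit multiple records, all timestamped so the
    consolidation step can order and conflict-resolve them later."""
    now = datetime.now(timezone.utc).isoformat()
    records = []
    for name, fn in STRATEGIES.items():
        for content in fn(messages):
            records.append(MemoryRecord(strategy=name, content=content, timestamp=now))
    return records

event = [
    {"role": "user", "text": "My office is in Seattle"},
    {"role": "user", "text": "I prefer morning meetings"},
]
records = extract_event(event)
```

Because the strategies are independent, a real implementation can fan them out in parallel, exactly as the pipeline description above suggests.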

2. Memory consolidation

Rather than simply adding new memories to existing storage, the system performs intelligent consolidation to merge related information, resolve conflicts, and minimize redundancies. This consolidation makes sure the agent’s memory remains coherent and up to date as new information arrives.

The consolidation process works as follows:

Retrieval: For each newly extracted memory, the system retrieves the most semantically similar existing memories from the same namespace and strategy.

Intelligent processing: The new memory and the retrieved memories are sent to the LLM with a consolidation prompt. The prompt preserves semantic context, avoiding unnecessary updates (for example, "loves pizza" and "likes pizza" are considered essentially the same information). Preserving these core principles, the prompt is designed to handle various scenarios:

You are an expert in managing data. Your job is to manage the memory store. Whenever a new input is given, your job is to decide which operation to perform.
Here is the new input text.
TEXT: {query}
Here is the relevant and existing memories.
MEMORY: {memory}
You can call multiple tools to manage the memory stores...

    Based on this prompt, the LLM determines the appropriate action:

ADD: Create a new memory when the information is distinct from existing memories.
UPDATE: Enhance existing memories when the new knowledge complements or updates them.
NO-OP: Take no action when the information is redundant.

Vector store updates: The system applies the determined actions, maintaining an immutable audit trail by marking outdated memories as INVALID instead of deleting them immediately.

This approach makes sure that contradictory information is resolved (prioritizing recent information), duplicates are minimized, and related memories are appropriately merged.
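The decision loop above can be sketched as follows. The token-overlap similarity measure and the thresholds are toy stand-ins for the real vector retrieval and LLM consolidation prompt; only the ADD/UPDATE/NO-OP control flow and the INVALID-marking audit trail mirror the described behavior.

```python
# Toy consolidation loop. The real system uses vector similarity and an
# LLM consolidation prompt; the Jaccard measure and thresholds here are
# illustrative stand-ins.
def similarity(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def consolidate(new_memory: str, store: list, same: float = 0.5, related: float = 0.25) -> str:
    """Decide ADD / UPDATE / NO-OP for one newly extracted memory.
    Outdated entries are marked INVALID rather than deleted, keeping an
    immutable audit trail."""
    valid = [m for m in store if m["status"] == "VALID"]
    best = max(valid, key=lambda m: similarity(new_memory, m["content"]), default=None)
    score = similarity(new_memory, best["content"]) if best else 0.0
    if score < related:                      # distinct information
        store.append({"content": new_memory, "status": "VALID"})
        return "ADD"
    if score >= same:                        # essentially the same fact
        return "NO-OP"
    best["status"] = "INVALID"               # supersede, but keep for audit
    store.append({"content": new_memory, "status": "VALID"})
    return "UPDATE"

store = [{"content": "User likes pizza", "status": "VALID"}]
consolidate("User loves pizza", store)                    # redundant -> NO-OP
consolidate("User now prefers pizza with olives", store)  # supersedes -> UPDATE
consolidate("User works in Berlin", store)                # distinct -> ADD
```

Note that after the UPDATE call the original "likes pizza" entry remains in the store with status INVALID, which is what makes the history auditable.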

Handling edge cases

The consolidation process gracefully handles several challenging scenarios:

Advanced custom memory strategy configurations

While built-in memory strategies cover common use cases, different domains often require tailored approaches to memory extraction and consolidation. AgentCore Memory therefore supports built-in strategies with overrides: custom prompts that extend the built-in extraction and consolidation logic, letting teams adapt memory handling to their specific requirements. Custom prompts focus on extraction criteria and consolidation logic rather than output formats (which keeps them compatible with the system), and let developers control what information gets extracted or filtered out, how memories should be consolidated, and how conflicts between contradictory information are resolved.

AgentCore Memory also supports custom model selection for memory extraction and consolidation. This flexibility helps developers balance accuracy and latency based on their specific needs. You can define these overrides via the APIs when you create the memory_resource, or through the console.
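As an illustration, a memory resource with a strategy override might be configured along these lines. This is a hypothetical sketch loosely modeled on the CreateMemory API: the field names (`memoryStrategies`, `customMemoryStrategy`, `appendToPrompt`, `modelId`) and the model ID are assumptions, so check the API reference for the exact schema.

```python
# Hypothetical configuration shape; field names are assumptions, not the
# verified AgentCore Memory API schema.
memory_config = {
    "name": "support_agent_memory",
    "memoryStrategies": [
        {
            # Built-in semantic strategy used as-is.
            "semanticMemoryStrategy": {
                "name": "product_facts",
                "namespaces": ["/facts/{actorId}"],
            }
        },
        {
            # Built-in strategy with overrides: custom prompt additions and
            # model choice extend the default extraction/consolidation logic.
            "customMemoryStrategy": {
                "name": "support_preferences",
                "namespaces": ["/preferences/{actorId}"],
                "configuration": {
                    "extraction": {
                        "appendToPrompt": "Only extract preferences about "
                                          "support channels and response times.",
                        "modelId": "anthropic.claude-3-5-haiku-20241022-v1:0",
                    },
                    "consolidation": {
                        "appendToPrompt": "Prefer the most recent preference "
                                          "when two conflict.",
                        "modelId": "anthropic.claude-3-5-haiku-20241022-v1:0",
                    },
                },
            }
        },
    ],
}
```

The intent is to show the shape of an override, criteria in the prompt additions and a smaller model for latency, rather than exact syntax.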

Apart from override functionality, we also offer self-managed strategies that provide complete control over your memory processing pipeline. With self-managed strategies, you can implement custom extraction and consolidation algorithms using any models or prompts while leveraging AgentCore Memory for storage and retrieval. Also, using the Batch APIs, you can directly ingest extracted records into AgentCore Memory while maintaining full ownership of the processing logic.
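A self-managed strategy might look like the following sketch: your own extraction logic runs outside the service, and the finished records are shaped for batch ingestion. The record fields and namespace value here are hypothetical; the Batch API documentation defines the real schema.

```python
import json
from datetime import datetime, timezone

# Sketch of a self-managed strategy: you own extraction and consolidation,
# then ingest the finished records. The record shape and namespace are
# hypothetical placeholders for the real Batch API schema.
def my_extraction(turns: list) -> list:
    # Your own logic or model call goes here; this toy version keeps
    # only turns that state a lasting preference.
    return [t for t in turns if "always" in t.lower()]

def to_batch_records(contents: list, namespace: str) -> list:
    ts = datetime.now(timezone.utc).isoformat()
    return [
        {"namespace": namespace, "content": c, "timestamp": ts}
        for c in contents
    ]

turns = ["Always CC my manager on escalations", "Thanks, that worked!"]
records = to_batch_records(my_extraction(turns), "/preferences/user-123")
payload = json.dumps(records)  # ready to hand to a batch-ingest call
```

The point of the pattern is ownership: AgentCore Memory still handles storage and retrieval, but every extraction and consolidation decision is yours.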

Performance characteristics

We evaluated our built-in memory strategy across three public benchmarking datasets to assess different aspects of long-term conversational memory:

We use two standard metrics: correctness and compression rate. LLM-based correctness evaluates whether the system can correctly recall and use stored information when needed. Compression rate measures how effectively the memory system condenses information, defined as 1 − (output memory token count / full context token count); a higher compression rate means the system preserves essential information while reducing storage overhead. This compression directly translates to faster inference and lower token consumption, the most critical consideration for deploying agents at scale, because it enables more efficient processing of large conversational histories and reduces operational costs.
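As a concrete reading of the metric, with made-up token counts, this interprets the compression rate as the fraction of context tokens eliminated, which is consistent with the table that follows, where the full-history baseline scores 0%:

```python
def compression_rate(full_context_tokens: int, memory_tokens: int) -> float:
    """Fraction of the full conversational context eliminated by the
    memory system; higher is better as long as correctness holds up."""
    return 1 - memory_tokens / full_context_tokens

# Made-up token counts for illustration: a long conversation history
# distilled into a compact set of memory records.
rate = compression_rate(full_context_tokens=120_000, memory_tokens=6_000)
print(f"{rate:.0%}")  # 95%
```

Storing raw history (memory_tokens == full_context_tokens) yields a rate of 0%, matching the RAG baseline rows in the table.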

Memory Type                               Dataset        Correctness  Compression Rate
RAG baseline (full conversation history)  LoCoMo         77.73%       0%
                                          LongMemEval-S  75.2%        0%
                                          PrefEval       51%          0%
Semantic Memory                           LoCoMo         70.58%       89%
                                          LongMemEval-S  73.60%       94%
Preference Memory                         PrefEval       79%          68%
Summarization                             PolyBench-QA   83.02%       95%

The retrieval-augmented generation (RAG) baseline performs well on factual QA tasks thanks to its access to the complete conversation history, but struggles with preference inference. The memory system achieves strong practical trade-offs: although information compression leads to slightly lower correctness on some factual tasks, it provides 89-95% compression rates for scalable deployment, keeps context sizes bounded, and each memory type performs effectively at its specialized use case.

For more complex tasks requiring inference (understanding user preferences or behavioral patterns), memory demonstrates clear advantages in both performance accuracy and storage efficiency—the extracted insights are more valuable than raw conversational data for these use cases.

Beyond accuracy metrics, AgentCore Memory delivers the performance characteristics necessary for production deployment.

These latency characteristics, combined with the high compression rates, enable the system to maintain responsive user experiences while managing extensive conversational histories efficiently across large-scale deployments.

Best practices for long-term memory

To maximize the effectiveness of long-term memory in your agents:

Choose memory strategies that match the information your application actually needs to recall.
Design meaningful namespaces to keep memories organized per actor and context.
Monitor consolidation patterns to confirm that related memories merge and conflicts resolve as expected.
Plan for asynchronous processing, since extraction and consolidation run in the background after events are stored.

Conclusion

The Amazon Bedrock AgentCore Memory long-term memory system represents a significant advancement in building AI agents. By combining sophisticated extraction algorithms, intelligent consolidation processes, and immutable storage designs, it provides a robust foundation for agents that learn, adapt, and improve over time.

The science behind this system, from research-backed prompts to the innovative consolidation workflow, makes sure that your agents don't just remember, but understand. This transforms one-time interactions into continuous learning experiences, creating AI agents that become more helpful and personalized with every conversation.

Resources:
AgentCore Memory Docs
AgentCore Memory code samples
Getting started with AgentCore – Workshop


About the authors

Akarsha Sehwag is a Generative AI Data Scientist for the Amazon Bedrock AgentCore GTM team. With over six years of expertise in AI/ML, she has built production-ready enterprise solutions across diverse customer segments in the generative AI, deep learning, and computer vision domains. Outside of work, she likes to hike, bike, or play badminton.

Jiarong Jiang is a Principal Applied Scientist at AWS, driving innovations in Retrieval-Augmented Generation (RAG) and agent memory systems to improve the accuracy and intelligence of enterprise AI. She’s passionate about enabling customers to build context-aware, reasoning-driven applications that leverage their own data effectively.

Jay Lopez-Braus is a Senior Technical Product Manager at AWS. He has over ten years of product management experience. In his free time, he enjoys all things outdoors.

Dani Mitchell is a Generative AI Specialist Solutions Architect at Amazon Web Services (AWS). He is focused on helping accelerate enterprises across the world on their generative AI journeys with Amazon Bedrock and Bedrock AgentCore.

Peng Shi is a Senior Applied Scientist at AWS, where he leads advancements in agent memory systems to enhance the accuracy, adaptability, and reasoning capabilities of AI. His work focuses on creating more intelligent and context-aware applications that bridge cutting-edge research with real-world impact.
