LinkedIn Builds a GenAI Platform to Accelerate Product Innovation

 

Over the past two years, LinkedIn has transformed aggressively, growing from a handful of generative AI features into a full platform strategy. To scale GenAI use cases efficiently and responsibly across its ecosystem, LinkedIn recognized that its early siloed, experimental approach was slowing iteration and undermining consistency. The engineering teams therefore built a unified GenAI application stack to serve as the foundation for all future AI-driven initiatives. The platform is designed to increase development velocity, avoid rebuilding duplicate infrastructure, and bake in trust, privacy, and safety guarantees. This article explores how LinkedIn built this platform, covering the shift in languages and frameworks, unified prompt management, the skills abstraction, memory management, and model inference and fine-tuning, and how the platform eventually grew to support more sophisticated AI agents.

💻 The language and framework shift: To resolve the inconsistency between Java and Python in GenAI development, LinkedIn established Python as its preferred language, adopted LangChain as its primary GenAI application framework, and built internal tooling to make Python reliable in production.

📝 Unified prompt management: To overcome the mess and maintenance burden of early hardcoded prompts, LinkedIn built a Prompt Source of Truth system that centrally stores and manages prompts written in the Jinja templating language, enabling version control, modularity, and consistency.

🛠️ Skills abstraction and registry: LinkedIn introduced the concept of "skills" as capability abstractions over the APIs and tools an LLM can use, and created a skill registry where teams define, register, and discover skills, avoiding duplicate development and version drift, with a governance layer keeping the ecosystem healthy.

🧠 Memory management: LinkedIn designed two complementary memory systems: conversational memory (raw interaction history to preserve context) and experiential memory (signals derived from user interactions to enable personalization), although in hindsight the team believes a unified cognitive memory system might have been better.

⚙️ Model inference and fine-tuning: LinkedIn initially used models hosted on Azure OpenAI, with a central proxy enforcing safety checks and quota management. As demand grew, the company invested in its own AI platform, fine-tuning open-source models on technologies such as PyTorch, and exposed a unified OpenAI-compatible API so applications can switch seamlessly between external and internal models.

🚀 From assistants to AI agents: As its GenAI platform matured, LinkedIn began supporting more sophisticated AI agents that can decompose user intent, plan tasks, select tools, and execute plans, with human-in-the-loop controls balancing efficiency and accountability. LinkedIn also reuses its existing messaging infrastructure as the backbone for agent orchestration.

Which LLM is best for your SDLC? (Sponsored)

Think the newest LLM is best for coding? Sonar put leading SOTA models to the test and unveiled insights on GPT-5, Claude Sonnet 4, and Llama 3 in their new expanded analysis. The results may surprise you.

Discover details on:

Download the report today or watch the on-demand webinar for insights to manage your AI coding strategy for scale, quality, and security.

Get the facts


Note: This article is written in collaboration with the engineering team of LinkedIn. Special thanks to Karthik Ramgopal, a Distinguished Engineer from the LinkedIn engineering team, for helping us understand the GenAI architecture at LinkedIn. All credit for the technical details and diagrams shared in this article goes to the LinkedIn Engineering Team.

Over the past two years, LinkedIn has undergone a rapid transformation in how it builds and ships AI-powered products.

What began with a handful of GenAI features, such as collaborative articles, AI-assisted Recruiter capabilities, and AI-powered insights for members and customers, has evolved into a comprehensive platform strategy that now involves multiple products. One of the most prominent examples of this shift is Hiring Assistant, LinkedIn’s first large-scale AI agent for recruiters, designed to help streamline candidate sourcing and engagement.

Behind these product launches was a clear motivation to scale GenAI use cases efficiently and responsibly across LinkedIn’s ecosystem.

Early GenAI experiments were valuable but siloed because each product team built its own scaffolding for prompts, model calls, and memory management. This fragmented approach made it difficult to maintain consistency, slowed iteration, and risked duplicating efforts. As adoption grew, the engineering organization recognized the need for a unified GenAI application stack that could serve as the foundation for all future AI-driven initiatives.

The goals of this platform were as follows:

In this article, we look at how LinkedIn’s engineering teams met these goals while building the GenAI setup. We’ll explore how they transitioned from early feature experiments to a robust GenAI platform, and eventually to multi-agent systems capable of reasoning, planning, and collaborating at scale. Along the way, we’ll highlight architectural decisions, platform abstractions, developer tooling, and real insights from the team behind the stack.

Foundations of the GenAI Application Stack

Before LinkedIn could build sophisticated AI agents, it had to establish the right engineering foundations.

During 2023 and 2024, LinkedIn focused on creating a unified GenAI application stack that could support fast experimentation while remaining stable and scalable. GenAI is a branch of Deep Learning that helps generate content like text, images, or code based on input. The diagram below shows how it relates to other areas like AI and Machine Learning.

1 - The Language and Framework Shift

When LinkedIn started shipping GenAI features, the engineering stack was fragmented.

Online production systems were almost entirely written in Java, while offline GenAI experimentation, including prompt engineering and model evaluation, was taking place in Python. This created constant friction. Engineers had to translate ideas from Python into Java to deploy them, which slowed experimentation and introduced inconsistencies.

To solve this, LinkedIn made a decisive shift to use Python as a first-class language for both offline and online GenAI development. There were three main reasons behind this move:

After internal discussions and some healthy debate, the team created prototypes that demonstrated Python’s velocity advantages. These early wins helped build confidence across the organization, even among teams that had spent years in the Java ecosystem.

However, moving to Python did not mean rewriting all infrastructure at once. Instead, LinkedIn focused on incremental steps, such as:

To make Python a reliable choice for production systems, LinkedIn invested in developer tooling:

These steps ensured that Python development at LinkedIn felt natural and productive rather than forced.

Finally, LinkedIn adopted LangChain as its primary GenAI application framework. LangChain provided a structured way to build applications that use large language models, manage prompts, and call external tools.

However, this choice was not made lightly. The team conducted a comparative evaluation of multiple frameworks, including AutoGen, LlamaIndex, CrewAI, and even the possibility of building an in-house solution.

The evaluation focused on several criteria such as ecosystem maturity, developer velocity, integration flexibility, production reliability, and the ability to keep up with the fast-moving GenAI landscape. LangChain stood out because it offered a strong ecosystem, active community support, and solid abstractions for prompt and tool orchestration, all of which allowed LinkedIn to move quickly without being locked in.

To reduce long-term switching costs, LinkedIn intentionally kept its abstraction layer thin. They wrapped LangChain with internal logging, instrumentation, and storage infrastructure, producing a stable internal library shared across teams. This ensured that if a better framework emerged in the future, migration would be manageable.
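To make the idea concrete, here is a minimal sketch of what such a thin wrapper might look like, assuming LangChain's current Python packages (langchain_core, langchain_openai). The class and logging function names (InstrumentedChain, log_llm_call) are hypothetical, not LinkedIn's internal library.

```python
# Hypothetical sketch of a thin internal wrapper around LangChain.
# InstrumentedChain and log_llm_call are illustrative names, not LinkedIn's actual APIs.
import time

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


def log_llm_call(prompt_name: str, latency_ms: float, tokens) -> None:
    """Stand-in for internal logging and instrumentation infrastructure."""
    print(f"[genai-metrics] prompt={prompt_name} latency_ms={latency_ms:.1f} tokens={tokens}")


class InstrumentedChain:
    """Wraps a LangChain runnable so every call is logged consistently across teams."""

    def __init__(self, prompt_name: str, template: str, model: str = "gpt-4o-mini"):
        self.prompt_name = prompt_name
        prompt = ChatPromptTemplate.from_template(template)
        self.chain = prompt | ChatOpenAI(model=model)

    def invoke(self, variables: dict) -> str:
        start = time.perf_counter()
        result = self.chain.invoke(variables)
        latency_ms = (time.perf_counter() - start) * 1000
        usage = getattr(result, "usage_metadata", None) or {}
        log_llm_call(self.prompt_name, latency_ms, usage.get("total_tokens"))
        return result.content


# Example usage: teams depend on the internal library, not on LangChain directly.
# summary = InstrumentedChain("profile_summary", "Summarize this profile: {profile}").invoke({"profile": "..."})
```

Because application code depends only on the internal wrapper, swapping out or upgrading the underlying framework stays a contained change.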

2 - Prompt Management

In the earliest GenAI experiments, prompts were simply hardcoded strings in the application code. For example, a recruiter assistant might have a long text prompt written directly in a Java class. This worked for simple cases, but quickly became messy. Managing multiple versions, ensuring prompt quality, and enforcing responsible AI guidelines was difficult.

To address this, LinkedIn built a Prompt Source of Truth system.

Instead of scattering prompts across codebases, prompts were stored and managed centrally using the Jinja templating language. This allowed developers to write prompts with placeholders and expressions, which could be dynamically filled at runtime.
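As an illustration, here is a hypothetical prompt written as a Jinja template and rendered at runtime with the jinja2 Python package. The template text and variable names are invented for this example, not LinkedIn's actual prompts.

```python
# Illustrative only: a centrally stored prompt written as a Jinja template and
# rendered with runtime values. The prompt content and variables are hypothetical.
from jinja2 import Template

RECRUITER_OUTREACH_PROMPT = Template(
    "You are a recruiting assistant.\n"
    "Summarize why {{ candidate_name }} may be a fit for the {{ job_title }} role.\n"
    "{% if highlight_skills %}Focus on these skills: {{ highlight_skills | join(', ') }}.{% endif %}\n"
    "Keep the tone {{ tone | default('professional') }}."
)

rendered = RECRUITER_OUTREACH_PROMPT.render(
    candidate_name="Alex",
    job_title="Staff ML Engineer",
    highlight_skills=["PyTorch", "distributed training"],
)
print(rendered)
```

Storing templates like this centrally, rather than as strings in application code, is what makes versioning, review, and reuse practical.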

The Prompt Source of Truth introduced several benefits:

As conversational interfaces became more common, LinkedIn aligned with the industry standard OpenAI Chat Completions API for handling multi-turn conversations. This made it easier to structure interactions between users and AI systems predictably.

3 - Skills

Prompts alone are not enough for complex applications. Many GenAI use cases require calling APIs or tools to retrieve data or perform actions. LinkedIn introduced a Skills abstraction to handle this in a structured way.

A Skill represents a capability that an LLM can use, such as searching for posts, viewing a member’s profile, or querying analytics systems. Initially, each team wrapped these APIs in custom code. As adoption grew, this led to duplication and version drift.

To fix this, LinkedIn created a skill registry. This is a central service where teams define their skills once, along with a schema and documentation. The build phase plugins can automatically register these skills so that GenAI applications can discover and use them at runtime through LangChain tools. This way, instead of application teams defining which skills they needed, downstream systems define the skills they provide, and applications can then discover and invoke them. This reduced duplication and made it easier to evolve APIs over time.
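The sketch below shows one way a skill definition with a schema could be exposed to an application as a LangChain tool. The SkillDefinition dataclass, the search_posts skill, and the registration flow are illustrative assumptions, not LinkedIn's registry API.

```python
# Hypothetical sketch of declaring a skill once and exposing it to a GenAI
# application as a LangChain tool. The registry itself is internal to LinkedIn.
from dataclasses import dataclass

from langchain_core.tools import StructuredTool


@dataclass
class SkillDefinition:
    name: str            # globally unique skill name, e.g. "search_posts"
    description: str     # documentation the LLM sees when choosing tools
    input_schema: dict   # JSON-schema style description of the arguments
    owner_team: str      # used for segmentation and governance reviews


def search_posts(query: str, limit: int = 10) -> list[str]:
    """Stand-in implementation; in practice this calls the owning team's API."""
    return [f"post matching '{query}' #{i}" for i in range(limit)]


# A build-time plugin would publish this definition to the central registry;
# at runtime, applications turn discovered skills into LangChain tools.
search_posts_skill = SkillDefinition(
    name="search_posts",
    description="Search member posts by keyword and return the top matches.",
    input_schema={"query": {"type": "string"}, "limit": {"type": "integer"}},
    owner_team="feed-platform",
)

search_posts_tool = StructuredTool.from_function(
    func=search_posts,
    name=search_posts_skill.name,
    description=search_posts_skill.description,
)
```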

The skill registry also includes a governance layer to prevent uncontrolled sprawl. During build time, automated similarity checks flag potential duplicates or overlaps. These are then reviewed by human reviewers who act as gatekeepers before a skill is deployed. Skills are segmented by team or use case to maintain clarity and avoid fragmentation. This combination of automated checks and human oversight has been important for keeping the skill ecosystem healthy as it scales.

4 - Memory

Language models do not retain memory between calls. If they need to remember what happened in previous interactions, that functionality has to be built around them.

LinkedIn approached this with two complementary systems: Conversational Memory and Experiential Memory.

Conversational memory is the raw interaction history that allows agents to maintain context across multiple turns. LinkedIn built this on top of its existing Messaging system, which already supports reliable message storage, retrieval, and synchronization across devices. On top of raw storage, they added semantic search and summarization so that only relevant pieces of history are fed back to the model. This was integrated with LangChain’s memory abstraction to make it easy for developers to use.

Experiential memory captures signals derived from user interactions, such as a recruiter’s preferred tone, default job locations, or notification channel preferences. These insights are stored and reused to personalize future interactions. This type of memory is more structured and hierarchical, built up from user behavior over time. Since experiential memory contains more sensitive user-derived data, LinkedIn had to enforce strict privacy and policy governance around how it is stored and used.
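A minimal sketch of the two memory types, assuming simple in-process stores, might look like this; the class names and methods are illustrative, not LinkedIn's internal interfaces.

```python
# Illustrative separation of conversational and experiential memory.
# In production, conversational memory is backed by the messaging infrastructure
# and experiential memory is gated by privacy and policy checks.
from dataclasses import dataclass, field


@dataclass
class ConversationalMemory:
    """Raw multi-turn history used to keep context between turns."""
    turns: list = field(default_factory=list)

    def append(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def recent_context(self, max_turns: int = 10) -> list:
        # A real system would use semantic search and summarization here,
        # not just the tail of the conversation.
        return self.turns[-max_turns:]


@dataclass
class ExperientialMemory:
    """Derived, structured signals used for personalization (tone, defaults, etc.)."""
    preferences: dict = field(default_factory=dict)

    def record(self, key: str, value: str) -> None:
        # Writes would be subject to privacy and policy governance.
        self.preferences[key] = value

    def get(self, key: str, default=None):
        return self.preferences.get(key, default)
```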

In general, memory turned out to be one of the hardest parts of the stack to design. In hindsight, the team would have preferred a unified cognitive memory system from the start, rather than separate conversational and experiential layers, to speed up experimentation and reduce complexity.

5 - Model Inference and Fine-Tuning

Early GenAI applications at LinkedIn used models hosted through Azure OpenAI.

All traffic to these models flowed through a central GenAI proxy. This proxy enforced responsible AI checks, managed streaming responses for better user experience, and applied quota limits to ensure fair usage across teams.

As LinkedIn’s needs grew, the company invested in its own AI platform based on PyTorch, DeepSpeed, and vLLM. This platform allowed engineers to fine-tune open-source models like Llama for LinkedIn-specific tasks. In many cases, these fine-tuned models performed as well as or better than proprietary models but at lower cost and latency.

To make switching between external and internal models seamless, LinkedIn exposed all models through a single OpenAI-compatible API. This meant that application developers did not have to change their code when the underlying model changed. LinkedIn also gave members control over whether their data could be used for training and fine-tuning, aligning with the company’s privacy commitments.
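The snippet below sketches why an OpenAI-compatible interface keeps application code unchanged when the backend model changes. The base URLs and model names are placeholders; only the standard openai Python client is assumed.

```python
# Illustrative sketch: the same application code works against an external hosted
# model and an internally served fine-tuned model, because both speak the same API.
from openai import OpenAI

# External, hosted model (e.g. reached through the central GenAI proxy).
external_client = OpenAI(base_url="https://genai-proxy.example.com/v1", api_key="...")

# Internal, fine-tuned open-source model served behind the same API shape (e.g. vLLM).
internal_client = OpenAI(base_url="https://internal-inference.example.com/v1", api_key="...")


def summarize(client: OpenAI, model: str, text: str) -> str:
    # Application code is identical regardless of which backend serves the model.
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Summarize the following text in one sentence."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content


# summarize(external_client, "gpt-4o-mini", "...") and
# summarize(internal_client, "llama-3-8b-finetuned", "...") are called the same way.
```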

6 - Migration at Scale

Rolling out this new GenAI stack required migrating existing Java-based applications to the new Python-based framework.

LinkedIn followed an incremental migration strategy:

An important part of this migration was upskilling engineers. Many senior engineers were Java experts but new to Python. LinkedIn paired them with more experienced Python developers to accelerate learning and reduce risk. This pairing model helped the organization adopt the new stack without stalling product development.

From Assistants to AI Agents

By 2025, LinkedIn’s GenAI stack had matured enough to support more than just conversational assistants.

The platform evolved to handle AI agents. This step required new abstractions for defining agents, orchestrating their workflows, ensuring reliability and observability, and maintaining strong security and privacy guarantees.

Let’s look at each of these aspects in detail.

1 - What’s an Agent?

In LinkedIn’s platform, an agent is a modular component that can take a user’s intent, break it down into sub-tasks, decide which tools or skills to use, and then execute a plan to deliver results. Unlike simple assistants, which respond to a single query in isolation, agents are capable of multi-step reasoning and coordinated actions.

See the diagram below that shows the anatomy of a GenAI agent.

However, full autonomy is not always desirable. LinkedIn deliberately designed agents to work with human-in-the-loop (HITL) controls. At critical decision points, such as sending candidate outreach messages or modifying search filters, agents pause and request human approval. This approach allows the system to combine the efficiency of automation with the judgment and accountability of human users, which is especially important in professional contexts like recruiting.
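As a rough illustration of this pattern, here is a simplified plan-and-execute loop with a human-in-the-loop checkpoint. The planner, steps, and approval mechanism are stand-ins rather than LinkedIn's agent framework.

```python
# Simplified, hypothetical plan-and-execute loop with an HITL approval gate.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    description: str
    action: Callable[[], str]
    requires_approval: bool = False   # e.g. sending candidate outreach messages


def plan(intent: str) -> list:
    """Stand-in planner; a real agent would use an LLM to decompose the intent."""
    return [
        Step("Search for matching candidates", lambda: "found 12 candidates"),
        Step("Draft outreach message", lambda: "draft ready"),
        Step("Send outreach message", lambda: "message sent", requires_approval=True),
    ]


def ask_human_approval(step: Step) -> bool:
    """Stand-in HITL gate; in production this would surface an approval UI."""
    return input(f"Approve step '{step.description}'? [y/n] ").strip().lower() == "y"


def run_agent(intent: str) -> None:
    for step in plan(intent):
        if step.requires_approval and not ask_human_approval(step):
            print(f"Skipped: {step.description}")
            continue
        print(f"{step.description}: {step.action()}")


# run_agent("Find senior ML engineers in Dublin and reach out")
```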

2 - Discoverability of AI Agents

To make agents first-class citizens in LinkedIn’s architecture, the team introduced a gRPC service schema to define each agent’s contract. This schema describes the inputs the agent accepts, the outputs it produces, and any configuration parameters.

Once defined, agents are registered via a build plugin into an evolving agent registry. Other agents and applications can then discover and call these registered agents without needing to know implementation details.
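A hypothetical agent contract, written as a gRPC/protobuf service definition, might look like the following; the message fields and service name are invented for illustration, not LinkedIn's actual schema.

```protobuf
// Hypothetical agent contract registered (via a build plugin) into the agent registry.
syntax = "proto3";

package agents.hiring;

// Input the agent accepts: the user's intent plus optional configuration.
message AgentRequest {
  string conversation_id = 1;     // ties the call back to conversational memory
  string user_intent = 2;         // e.g. "find senior ML engineers in Dublin"
  map<string, string> config = 3; // agent-specific configuration parameters
}

// Output the agent produces for the caller or for another agent.
message AgentResponse {
  string status = 1;              // e.g. "completed", "awaiting_approval"
  string result_summary = 2;
}

service HiringSourcingAgent {
  rpc Execute (AgentRequest) returns (AgentResponse);
}
```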

3 - Orchestrating Multi-Agent Workflows

One of the most important decisions LinkedIn made was to use its existing Messaging infrastructure as the backbone for agent orchestration.

The diagram below shows how the various components of the platform relate to each other.

LinkedIn Messaging already provides FIFO (first-in, first-out) guarantees, automatic retries, history lookup, and parallel threads, all of which are essential for reliable multi-agent coordination. Instead of building a new orchestration system from scratch, the team built libraries that translate between messaging events and gRPC calls. This allows developers to write normal agent logic while the orchestration layer takes care of message passing and state tracking behind the scenes.
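The sketch below shows, with hypothetical names, how such a library might bridge a messaging event to a gRPC agent call and post the result back on the same thread. It mirrors the pattern described above rather than LinkedIn's actual implementation.

```python
# Rough sketch of translating a messaging event into a gRPC agent invocation.
# The registry and messaging client are abstract stand-ins for internal services.
from dataclasses import dataclass


@dataclass
class AgentMessage:
    thread_id: str         # FIFO thread used for ordering, retries, and history lookup
    sender: str
    recipient_agent: str   # name used to resolve the agent in the agent registry
    payload: str


def handle_incoming_message(msg: AgentMessage, agent_registry, messaging_client) -> None:
    """Translate one messaging event into an agent call and publish the reply."""
    # Resolve a client/stub for the recipient agent from the registry.
    agent_client = agent_registry.lookup(msg.recipient_agent)

    # Invoke the agent; in practice this builds the protobuf request defined by
    # the agent's registered gRPC contract.
    response = agent_client.execute(conversation_id=msg.thread_id, user_intent=msg.payload)

    # Post the result back on the same thread so the user or other agents can consume
    # it, while the messaging layer keeps providing ordering and replayable history.
    messaging_client.send(
        thread_id=msg.thread_id,
        sender=msg.recipient_agent,
        payload=response.result_summary,
    )
```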

Agents can use different response modes depending on the task:

Using messaging as the orchestration backbone was both a pragmatic and strategic decision. It avoided reinventing reliable delivery mechanisms and made debugging easier since message history could be replayed when diagnosing failures.

On the client side, supporting agentic workflows required new capabilities. LinkedIn added libraries for push notifications, cross-device state synchronization, and incremental streaming. This allows users to start a task on one device, receive updates on another, and stay informed if an agent continues working in the background.

There are also frontend API server endpoints that complement the messaging flows. These handle cases where a direct synchronous call is better suited, such as when the user is waiting for a quick response inside the UI.

4 - Observability and Continuous Improvement

Building agentic systems made observability even more critical. Developers needed ways to trace multi-step reasoning, tool calls, and messaging interactions to debug and improve performance.

See the diagram below that shows the observability workflows for LinkedIn’s agentic systems.

LinkedIn uses two complementary systems:

5 - Developer Tooling: Playground

To make agent development faster and safer, LinkedIn expanded its internal Playground.

The Playground lets developers prototype agents and skills quickly without deploying to production. They can inspect memory contents, test identity and authorization rules, and view real-time traces of execution.

This environment acts as a low-friction sandbox that encourages experimentation while maintaining platform consistency.

6 - Emerging Patterns

As more teams started building agents, LinkedIn observed several consistent design patterns:

These patterns are gradually shaping a shared architectural language across teams.

7 - Platform Guardrails and Interoperability

As agent complexity grew, security and privacy became even more critical.

LinkedIn enforces strict siloing between three layers: the Client Data Layer, Conversational and Experiential Memory, and the Agent Lifecycle Service. Interactions between these layers are governed by policy-driven interfaces, strong authentication and authorization, and auditable access to ensure responsible data use.

The platform supports both synchronous and asynchronous invocation paths. Synchronous calls provide lower latency for interactive experiences, while asynchronous paths are better for longer tasks that may not require immediate user feedback.

LinkedIn is also moving incrementally toward open protocols such as Model Context Protocol (MCP) for tool discovery and Agent-to-Agent (A2A) protocols for agent collaboration. MCP support is already partial, but A2A is still experimental.

See the diagram below that shows the difference between MCP and A2A on a high level:

Conclusion

LinkedIn’s journey from scattered GenAI experiments to a mature, agent-oriented platform offers valuable lessons for any engineering organization looking to scale AI. Rather than chasing flashy features, LinkedIn focused on foundational engineering decisions that made it possible to move fast without losing control.

Several practical lessons stand out from their experience:

These principles gave LinkedIn the confidence to move from simple assistants to full agentic systems.

A good example is Hiring Assistant, which applies agentic workflows to repetitive, context-heavy recruiter tasks. By automating candidate sourcing and outreach, the team has seen qualitative evidence of surfacing higher-quality candidates while allowing recruiters to focus on judgment and relationship building.

Also, one key takeaway from speaking to the LinkedIn engineering team was that AI does not replace engineers, but augments them. The most impactful contributors will be those who combine technical depth with strong leadership and communication skills. Building agentic systems is as much about coordinating people, teams, and strategy as it is about orchestrating LLMs and APIs.

References:


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing sponsorship@bytebytego.com.
