ByteByteGo · September 10
Netflix Tudum Architecture Evolution: From CQRS+Kafka to RAW Hollow

 

Netflix’s Tudum platform used a CQRS architecture with Kafka connecting the write path and the read path, but editorial preview latency eventually drove a move to the RAW Hollow in-memory object store. The new architecture removes the external key-value store, cache refreshes, and Kafka from the read path, turning reads from a network pipeline into memory-local lookups. Editorial preview time dropped from 1.4 seconds to 0.4 seconds, and the system became significantly simpler while preserving data consistency.

📊 Tudum’s early CQRS architecture separated the editorial write path from the visitor read path, passing data through Kafka, but suffered from editorial preview latency. After an editor saved an update, the content had to travel through a chain of operations: CMS notification, processing in Tudum’s ingestion layer, publishing to Kafka, handling by the Data Service Consumer, storage in Cassandra, and cache refresh, which delayed previews.

⏱️ The latency stemmed from the read path’s cache refresh mechanism: each page element refreshed independently on its own timer, and as pages gained more elements, refresh completion grew increasingly staggered, so previews showed some elements fresh and others stale. This eventual-consistency design suited production traffic but severely slowed the editorial workflow.

💾 Netflix engineers introduced the RAW Hollow in-memory object store to remove the bottleneck: the entire read-optimized dataset is loaded into service-instance memory for O(1) access, with no external store queries or cache refreshes. RAW Hollow offers distributed in-memory storage, data compression (about 130 MB for three years of content), per-request consistency control (strong consistency for preview requests), and in-memory computation.

🔄 In the new architecture, content updates load directly into RAW Hollow memory and sync across service instances, so editorial previews and visitor requests are both served instantly. The system moved from a “store → fetch → cache → refresh” chain to a memory-local “load into memory → serve instantly → propagate updates” model, greatly simplifying the architecture and improving performance.

Build MCP servers that perform with Gram by Speakeasy (Sponsored)

AI agents get confused by MCP servers that include too many tools, lack crucial context, and are simple API mirrors. Without extra work, your carefully designed APIs cause agent headaches.

Gram fixes this. It's an open source platform where you can curate tools: add context, design multi-step tools, and deploy your MCP server in minutes.

Transform your APIs into agent-ready infrastructure that scales, with OAuth 2.1 support, centralized management, and hosted infrastructure.

Start building your MCP server →


Disclaimer: The details in this post have been derived from the official documentation shared online by the Netflix Engineering Team. All credit for the technical details goes to the Netflix Engineering Team.  The links to the original articles and sources are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

When Netflix launched Tudum as its official home for behind-the-scenes stories, fan interviews, and interactive experiences, the engineering challenge was clear: deliver fresh, richly formatted content to millions of viewers at high speed, while giving editors a seamless way to preview updates in real time. 

The initial architecture followed a classic CQRS (Command Query Responsibility Segregation) pattern, separating the “write path” for editorial tools from the “read path” for visitors. Kafka connected these paths, pushing read-optimized data into backend services for page construction. 

The approach was scalable and reliable, but not without trade-offs.

As Tudum grew, editors noticed a frustrating delay between saving an update and seeing it live in previews. The culprit was a chain of sequential processes and cache refresh cycles that, while suitable for production visitors, slowed down the creative workflow. 

To solve this, Netflix engineers replaced the read-path’s external key-value store and per-request I/O with RAW Hollow: a compressed, distributed, in-memory object store embedded directly in the application. 

The result was near-instant editorial preview, simpler infrastructure, and a major drop in page construction time for end users. In this article, we will look at the evolution of this design decision and how Netflix went about implementing it.

Early Design

Netflix’s Tudum platform had to support two fundamentally different workflows:

- A write-heavy editorial workflow, where writers and editors create and update content in the CMS and need fast, accurate previews.
- A read-heavy visitor workflow, where millions of viewers request fully rendered pages with low latency.
To keep these workflows independent and allow each to scale according to its needs, Netflix adopted a CQRS (Command Query Responsibility Segregation) architecture. 

See the diagram below for a general overview of CQRS.

The write store contains the raw editorial data (internal CMS objects with IDs, metadata, and references), while the read store contains a fully “render-ready” version of the same data, such as resolved movie titles instead of IDs, CDN-ready image URLs instead of internal asset references, and precomputed layout elements.
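The write-store-to-read-store transformation described above can be sketched as a simple projection function. This is a minimal illustration under assumed names, not Netflix's actual pipeline: the entity fields, lookup table, and CDN base URL are all hypothetical.

```python
# Hypothetical sketch: projecting a raw editorial object (internal IDs and
# asset references) into a render-ready read-store entity.

RAW_CMS_OBJECT = {
    "id": "page-123",
    "movie_id": "m-81037371",         # internal reference, not display-ready
    "hero_asset": "asset://hero/123", # internal asset reference
}

# In the real pipeline these resolutions come from other services;
# here a static lookup table and base URL stand in for them.
MOVIE_TITLES = {"m-81037371": "Stranger Things"}
CDN_BASE = "https://cdn.example.com/"

def to_read_entity(raw: dict) -> dict:
    """Project a raw editorial object into a render-ready entity."""
    return {
        "id": raw["id"],
        # resolve the internal movie ID to its display title
        "title": MOVIE_TITLES[raw["movie_id"]],
        # rewrite the internal asset reference to a CDN-ready URL
        "hero_url": CDN_BASE + raw["hero_asset"].removeprefix("asset://"),
    }

print(to_read_entity(RAW_CMS_OBJECT))
```

The read side then serves these precomputed entities directly, without re-resolving references at request time.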

As mentioned, Kafka served as the bridge between the two paths. When an editor made a change, the CMS emitted an event to Tudum’s ingestion layer. This ingestion pipeline performed the following steps:

- Validated the incoming editorial data.
- Resolved internal references, such as movie IDs into display titles and asset references into CDN-ready image URLs.
- Shaped the result into read-optimized, render-ready page elements.
The processed content was published to a Kafka topic.

A Data Service Consumer subscribed to this topic, reading each new or updated page element. It wrote this data into a Cassandra-backed read store, structured for fast retrieval. Finally, an API layer exposed these read-optimized entities to downstream consumers such as the Page Construction Service (which assembles full pages for rendering), personalization services, and other internal tools.

See the diagram below:

This event-driven design ensured that editorial changes would eventually appear on the Tudum website without impacting write-side performance, while also allowing Netflix to scale the read and write paths independently.

The Pain of Eventual Consistency

While the CQRS-with-Kafka design was robust and scalable, it introduced a workflow bottleneck that became increasingly visible as Tudum’s editorial output grew.

Every time an editor made a change in the CMS, that change had to travel through a long chain before it appeared in a preview environment or on the live site. Here is a quick look at the various steps involved:

- The CMS notified Tudum’s ingestion layer of the change.
- The ingestion layer processed the content and published it to a Kafka topic.
- The Data Service Consumer read the event and wrote the entity to the Cassandra-backed read store.
- A near-cache in front of the read store refreshed on its own schedule before the change became visible.
This near-cache was a key contributor to the delay. Technically speaking, the near-cache is a small, per-instance, in-memory layer that sits in front of the read store. Rather than refreshing instantly on every update, it operated on a scheduled per-key refresh policy: each key had its own timer, and when the timer fired, the instance refreshed that key from the backing store. While this approach was designed for production traffic efficiency, it meant that fresh edits often waited for the next scheduled refresh cycle before appearing.

As content volume and the number of page elements increased, these refresh cycles stretched longer. A page is assembled from multiple fragments, each with its key and timer. They do not refresh together. This meant that the more elements a page had, the more staggered the refresh completion became, leading to inconsistent preview states. In other words, some elements got updated, but others remained stale.
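The per-key, timer-based refresh behavior described above can be sketched as follows. This is an illustrative model only; Netflix's actual near-cache implementation is not public.

```python
import time

class NearCache:
    """Toy model of a per-key, timer-based refresh cache. Each key is
    re-read from the backing store only when its own TTL expires, so a
    fresh write to the store stays invisible until that key's next
    refresh tick."""

    def __init__(self, backing_store: dict, ttl_seconds: float):
        self.store = backing_store
        self.ttl = ttl_seconds
        self._cache = {}  # key -> (value, fetched_at)

    def get(self, key):
        now = time.monotonic()
        entry = self._cache.get(key)
        if entry is None or now - entry[1] >= self.ttl:
            # timer fired (or first access): refresh this key only
            entry = (self.store[key], now)
            self._cache[key] = entry
        return entry[0]

store = {"hero": "v1"}
cache = NearCache(store, ttl_seconds=60.0)
assert cache.get("hero") == "v1"
store["hero"] = "v2"               # editor saves an update
assert cache.get("hero") == "v1"   # still stale until the TTL expires
```

Because every page fragment has its own key and timer, a multi-fragment page can mix fresh and stale values until all of its timers have fired, which is exactly the staggered-preview problem the article describes.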

The result was that editors had to sometimes wait minutes to see their changes reflected in a preview, even though the system had already processed and stored the update. 

For a platform like Tudum, where timing is critical for publishing stories tied to new releases and events, this delay disrupted editorial flow and complicated collaboration between writers, editors, and designers.

The Solution: RAW Hollow

To eliminate the bottlenecks in Tudum’s read path, Netflix engineers turned to RAW Hollow: a compressed, distributed, in-memory object store designed for scenarios where datasets are small-to-medium in size, change infrequently, and must be served with extremely low latency.

Unlike the earlier setup, where read services fetched data from an external Cassandra-backed key-value store (with network calls, cache layers, and refresh cycles), RAW Hollow keeps the entire dataset loaded directly into the memory of every application instance that needs it. This means all lookups happen in-process, avoiding the I/O and cache-invalidation complexities of the old approach.
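The contrast between the old network-bound read and the new in-process lookup can be sketched like this. The names here are hypothetical stand-ins, not the RAW Hollow API.

```python
# Old read path: every lookup is a network round trip to an external store
# (plus cache layers in front of it). "fetch_from_store" stands in for the
# removed Cassandra-backed key-value store.
def old_read(key, fetch_from_store):
    return fetch_from_store(key)  # network I/O on the read path

# New read path: the whole read-optimized dataset lives in process memory,
# loosely modeling an embedded in-memory client like RAW Hollow.
class InMemoryReadStore:
    def __init__(self, snapshot: dict):
        self._data = dict(snapshot)  # loaded once at startup

    def get(self, key):
        return self._data[key]       # O(1) local lookup, no I/O

    def apply_update(self, key, value):
        # new versions are propagated to every instance and swapped in
        self._data[key] = value

store = InMemoryReadStore({"page-123": {"title": "Tudum Home"}})
assert store.get("page-123")["title"] == "Tudum Home"
store.apply_update("page-123", {"title": "Tudum Home v2"})
assert store.get("page-123")["title"] == "Tudum Home v2"
```

The design trade is straightforward: each instance spends memory to hold the full (compressed) dataset, and in exchange every read becomes a local data-structure access instead of a network call.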

The key characteristics of RAW Hollow in the Tudum context are as follows:

- Distributed, in-memory storage: every participating service instance holds the full read-optimized dataset in process memory.
- Aggressive compression: roughly three years of Tudum content compresses to about 130 MB, small enough to fit comfortably in each instance.
- Per-request consistency control: requests can opt into strong consistency, which editorial preview uses, while visitor traffic reads eventually consistent local data.
- In-memory computation: page construction works against local data structures rather than remote stores.
For Tudum, adopting RAW Hollow meant removing the Page Data Service, its near-cache, the external key-value store, and even Kafka from the read path. Instead, the Hollow client was embedded directly inside each microservice that needed content. This collapsed the number of sequential operations, tightened the feedback loop for editors, and simplified the architecture by removing multiple moving parts.

The result was a big shift: instead of “store → fetch → cache → refresh,” the system now operates on “load once into memory → serve instantly → propagate changes.” 
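RAW Hollow lets a request choose its consistency level, with preview requests opting into strong consistency. A conceptual sketch of that idea, strong reads catch up to the latest committed version while eventual reads serve the local snapshot, is below; this is a toy model, not RAW Hollow's actual mechanism.

```python
class VersionedStore:
    """Toy model of per-request consistency: strong reads sync to the
    latest committed version before serving; eventual reads serve
    whatever version this instance has already loaded."""

    def __init__(self):
        self.committed_version = 0  # highest version acked by the writer
        self.local_version = 0      # version this instance has loaded
        self._snapshots = {0: {}}

    def publish(self, data: dict):
        self.committed_version += 1
        self._snapshots[self.committed_version] = data

    def sync(self):
        self.local_version = self.committed_version

    def read(self, strong: bool) -> dict:
        if strong and self.local_version < self.committed_version:
            self.sync()  # preview request: catch up before serving
        return self._snapshots[self.local_version]

s = VersionedStore()
s.publish({"hero": "new headline"})
assert s.read(strong=False) == {}                      # visitor: older snapshot
assert s.read(strong=True) == {"hero": "new headline"} # preview: fresh data
```

This captures why editors see their changes immediately while visitor traffic can tolerate briefly serving the previous snapshot.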

The New Tudum Design

After adopting RAW Hollow, Netflix rebuilt Tudum’s read path to remove the layers that were slowing down editorial previews and adding unnecessary complexity. 

The new design still follows the CQRS principle (separating the editorial content creation from the visitor-facing content), but the way data moves through the read side is now radically simplified.

See the diagram below:

Here’s what changed in the architecture:

- The Page Data Service and its near-cache were removed from the read path.
- The external key-value store was eliminated.
- Kafka was taken out of the read path.
- A Hollow client was embedded directly inside each microservice that needs content.

The new flow works as follows:

- When an editor publishes a change, the updated content is loaded into RAW Hollow and propagated to every service instance.
- Editorial previews read strongly consistent in-memory data, so changes appear almost immediately.
- Visitor requests are served from the same memory-local dataset, with no per-request network fetches.
This re-architecture shifted Tudum’s read path from a multi-hop network-bound pipeline to a memory-local lookup model. In essence, Netflix took the scalability and separation of CQRS but stripped away the read path’s I/O-heavy plumbing, replacing it with a memory-first, embedded data model.

Conclusion

The shift from a Kafka-and-Cassandra read path with a cache layer to a RAW Hollow in-memory model produced immediate and measurable improvements for Tudum. Some of the key benefits were as follows:

- Editorial preview time dropped from roughly 1.4 seconds to about 0.4 seconds.
- The read path no longer depends on an external key-value store, cache refresh cycles, or Kafka.
- Page construction became a memory-local operation, simplifying the architecture and its operational surface.
Netflix’s re-architecture of Tudum’s read path shows how rethinking data access patterns can yield outsized gains in performance, simplicity, and developer experience.

By combining the scalability of CQRS with the speed of an in-memory, compressed object store like RAW Hollow, they created a system that serves both editorial agility and end-user responsiveness.

The lessons here are broadly applicable:

- Eventual consistency that is acceptable for end users can still be a serious bottleneck for internal workflows such as editorial preview.
- For small-to-medium datasets that change infrequently but are read constantly, an embedded in-memory store can outperform an external database fronted by caches.
- Removing moving parts (caches, brokers, external stores) from a hot path often improves both latency and operability.

References:


ByteByteGo Technical Interview Prep Kit

Launching the All-in-one interview prep. We’re making all the books available on the ByteByteGo website.

What's included:

Launch sale: 50% off


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing sponsorship@bytebytego.com.
