NVIDIA Developer
Distributed User Plane Function (dUPF) Powers 6G AI-Native Networks

This article examines the key role of the distributed User Plane Function (dUPF) in moving the telecom industry toward 6G AI-native networks. dUPF pushes compute to the network edge, using decentralized packet processing and routing to deliver ultra-low latency, high throughput, and seamless integration of distributed AI workloads. The hardware acceleration of NVIDIA DOCA Flow further strengthens dUPF performance, making it an important building block of 6G AI infrastructure. The article covers dUPF's architectural advantages, its benefits on NVIDIA-accelerated edge infrastructure, and practical use cases across low-latency and AI-intensive scenarios such as AR/VR, autonomous driving, and video search and summarization, and presents a dUPF user plane reference implementation based on the NVIDIA Grace CPU and BlueField-3 DPU.

🚀 **dUPF is a core component of 6G AI-native networks**: By moving data processing and routing to the network edge, the distributed User Plane Function (dUPF) breaks the latency bottleneck of traditional centralized networks and lays the foundation for AI-native radio access networks (AI-RAN) and the AI core (AI-Core), making it a key technology for realizing the 6G vision.

💡 **Edge computing and ultra-low latency**: dUPF moves user data processing closer to users and radio nodes, significantly reducing latency and enabling next-generation applications that demand real-time responsiveness, such as AR/VR, autonomous-driving communications, remote surgery, and industrial automation.

💻 **Advantages of the NVIDIA accelerated platform**: Combining the NVIDIA AI Aerial platform with the hardware acceleration of DOCA Flow, dUPF achieves latency as low as 25 microseconds, lowers operating costs, improves energy efficiency, and opens new revenue models for AI-native services and applications.

🔄 **Broad application scenarios**: dUPF deployments support a wide range of AI-intensive workloads and low-latency applications, such as video search and summarization (VSS), XR experiences, gaming, and industrial automation scenarios requiring localized AI inference, while effectively reducing the data load on backbone networks.

The telecommunications industry is innovating rapidly toward 6G for both AI-native Radio Access Networks (AI-RAN) and AI-Core. The distributed User Plane Function (dUPF) brings compute closer to the network edge through decentralized packet processing and routing, enabling ultra-low latency, high throughput, and the seamless integration of distributed AI workloads. dUPF is becoming a crucial component in the evolution of mobile networks to be part of the foundational AI infrastructure.

This post explores the architectural advantages of dUPF at the telecom edge to enable agentic AI applications. It features a reference implementation of a dUPF user plane application built with NVIDIA DOCA Flow to leverage hardware-accelerated packet steering and processing. The demonstration highlights how the NVIDIA accelerated compute platform enables energy-efficient, low-latency user plane operations, reinforcing the essential role of dUPF in the 6G AI-Native Wireless Networks Initiative (AI-WIN) full-stack architecture.

What is dUPF?

dUPF is a 3GPP 5G core network function that handles user plane packet processing at distributed locations, as defined in Section 6.2.5 of the 3GPP 5G core architecture and Section 4.2 of the 3GPP 5G Mobile Edge Computing (MEC) architecture. dUPF moves user data processing closer to users and radio nodes. Unlike traditional UPFs, which incur latency over long backhaul routes, dUPF handles traffic at the network edge, enabling real-time applications and local breakout for AI traffic through AI-specific local data networks (AI-DN), as shown in Figure 1.

Figure 1. dUPF in the 3GPP multiple PDU sessions MEC connectivity model anchors AI-DN traffic at distributed sites

How does dUPF work in the 6G AI-centric network?

6G aims to transform telecom operators into critical AI infrastructure, hosting AI factories and distributing AI inference as an AI grid. dUPF is a crucial aspect of this, enabling 6G distributed edge agentic AI and local breakout (LBO).

Next-generation applications like video search and summarization (VSS), XR, gaming, and industrial automation demand real-time, autonomous intelligence at the network edge, which traditional centralized wireless core architectures cannot provide.

Placing dUPF processing at the edge, close to users and radio nodes, offers several benefits:

- Ultra-low latency: Enables immediate responsiveness for mission-critical 6G use cases.
- Efficient data handling: Processes local data at the source, reducing latency and optimizing network resources.
- Enhanced data privacy and security: Localized processing minimizes sensitive data exposure, fostering trust.
- Decentralized compute for resilient AI: Distributes AI workloads, creating a robust, resilient infrastructure and eliminating single points of failure.

What are the benefits of dUPF on NVIDIA accelerated edge infrastructure?

NVIDIA AI Aerial platform is a suite of accelerated computing platforms, software, and services for designing, simulating, and operating wireless networks. The benefits of dUPF on AI Aerial edge infrastructure include:

- Ultra-low latency: Latency as low as 25 microseconds with zero packet loss, improving user experience for edge AI inferencing.
- Cost reduction: Lower backhaul and OPEX through distributed processing and optimized resource utilization, reducing transport costs.
- Energy efficiency: NVIDIA DOCA Flow-enabled hardware acceleration reduces CPU usage, freeing cores for AI applications on shared hardware and lowering power consumption.
- New revenue models: Enables AI-native services and applications requiring real-time edge data processing.
- Enhanced network performance: Improved scalability, jitter minimization, and deterministic behavior for AI and RAN traffic.

Figure 2. dUPF is a component of the NVIDIA AI Aerial platform application layer

The key value propositions of dUPF are fully aligned with the 6G AI-WIN initiative, making dUPF an integral part of the AI-WIN full stack. This initiative brings together T-Mobile, MITRE, Cisco, ODC, and Booz Allen Hamilton to develop an AI-native network stack for 6G, built on NVIDIA AI Aerial. 

dUPF use cases

Key use cases for dUPF include:

Ultra-low-latency applications: By hosting dUPF functions at the edge, data can be processed and routed locally, eliminating backhaul delays. This is critical for:

- AR/VR and real-time conversations with an AI agent
- VSS
- Autonomous vehicle and robot communications (V2X)
- Remote surgery and real-time industrial automation

AI and data-intensive workloads at the edge: Integration of dUPF with AI-native platforms (such as NVIDIA Grace Hopper) enables real-time edge inferencing for applications like distributed AI RAN, agentic AI, and localized autonomous control. 

Figure 3 illustrates a VSS data processing ingestion pipeline, where camera streams are handled at the edge alongside the deployed dUPF for local breakout. By shifting inference tasks to the edge server, operators deliver low-latency services while significantly reducing the data load on their backbone networks.

Figure 3. Camera and video streams can be offloaded to dUPF deployed at the edge for the VSS data processing block

dUPF user plane reference implementation

The dUPF user plane reference implementation is based on a decomposed architecture, as illustrated in Figure 4, which comprises two key components: dUPF-UP and dUPF-CP.

dUPF-UP: This component is responsible for user plane packet processing, accelerated using DOCA Flow APIs, and handles the essential UPF user plane rule types (see the sketch after this list):

- Packet Detection Rule (PDR)
- QoS Enforcement Rule (QER)
- Usage Report Rule (URR)
- Forwarding Action Rule (FAR)
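
To make these rule types concrete, the sketch below shows one plausible in-memory representation in C. The struct layouts are illustrative assumptions drawn from the 3GPP TS 29.244 definitions, not the actual reference implementation.

```c
#include <stdint.h>

/* Hypothetical in-memory forms of the four rule types above.
 * Field names and widths are illustrative, not the reference code. */

struct pdr {                      /* PDR: which packets match */
    uint32_t precedence;          /* lower value wins on overlap */
    uint32_t teid;                /* GTP-U tunnel ID (N3 uplink key) */
    uint32_t ue_ip;               /* UE IPv4 address (N6 downlink key) */
    uint8_t  qfi;                 /* QoS Flow Identifier */
    uint32_t far_id, qer_id, urr_id; /* rules applied on a match */
};

struct far {                      /* FAR: what to do with a match */
    uint32_t far_id;
    enum { FAR_FORWARD, FAR_DROP, FAR_BUFFER } action;
    uint8_t  dscp;                /* DSCP re-marking value */
    uint16_t out_port;            /* egress interface (N3 or N6) */
};

struct qer {                      /* QER: rate limits to enforce */
    uint32_t qer_id;
    uint64_t flow_mbr_bps;        /* per-QoS-flow Maximum Bit Rate */
    uint64_t session_mbr_bps;     /* per-session Maximum Bit Rate */
};

struct urr {                      /* URR: usage to measure and report */
    uint32_t urr_id;
    uint64_t pkts, bytes;         /* accumulated usage counters */
    uint64_t volume_quota;        /* report/stop when exhausted */
};
```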

dUPF-CP: This component communicates with SMF over a 3GPP N4 interface and communicates with dUPF-UP through an internal messaging interface (gRPC) over CNI to facilitate user plane packet processing.

Figure 4. dUPF reference architecture with dUPF-CP and dUPF-UP supporting 3GPP standard interfaces (N3, N6, and N4)

The dUPF-UP is deployed on the NVIDIA accelerated Supermicro 1U Grace Hopper MGX System server platform with an NVIDIA Grace CPU and an NVIDIA BlueField-3 (BF3) DPU. AI-DN traffic is handled by dUPF-UP at the edge, and other user traffic (such as Internet traffic) is delivered to the centralized UPF through the transport network.

dUPF-UP acceleration architecture and data flows

The NVIDIA Grace CPU Superchip and NVIDIA BlueField-3 (BF3) SuperNIC are key hardware for co-hosted RAN and dUPF-UP. Figure 5 illustrates dUPF-UP packet processing.

Figure 5. dUPF-UP application on an NVIDIA Grace CPU host with packet processing accelerated by BF3 hardware pipelines

The Grace CPU Superchip, with 72 Arm Neoverse V2 cores, uses the NVIDIA Scalable Coherency Fabric (SCF) to achieve 3.2 TB/s of bandwidth. This boosts dUPF user plane packet processing performance and energy efficiency. The BF3 SuperNIC accelerates dUPF data plane functions through DOCA Flow pipelines, including:

- Packet classification (5-tuples, DSCP/VLAN, GTP TEID/QFI)
- GTP encapsulation/decapsulation
- Metering (AMBR/MBR)
- Counting (URR usage/quotas)
- Forwarding (fast path for direct forwarding, slow path for exception packets)
- Mirroring for host CPU processing (Lawful Intercept, for example)
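
To make the match-action pattern behind these pipelines concrete, here is a small, self-contained C sketch. All types and the lookup loop are hypothetical simplifications of that pattern; the actual pipes are created through the DOCA Flow API and matched in BF3 hardware, as covered in the DOCA Flow documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one match-action pipe entry; real entries are
 * created via DOCA Flow and matched in BF3 hardware, not host C code. */

enum match_flags { M_TEID = 1, M_QFI = 2, M_DST_IP = 4, M_DSCP = 8 };

struct pipe_match {
    uint32_t flags;               /* which fields below are significant */
    uint32_t teid;                /* GTP tunnel endpoint ID */
    uint8_t  qfi;                 /* QoS flow identifier */
    uint32_t dst_ip;              /* destination IP address */
    uint8_t  dscp;                /* DSCP class */
};

enum pipe_action {
    ACT_GTP_DECAP, ACT_GTP_ENCAP, /* tunnel handling */
    ACT_METER,                    /* AMBR/MBR policing */
    ACT_COUNT,                    /* URR usage counting */
    ACT_FWD_FAST, ACT_FWD_SLOW,   /* fast path vs. exception path */
    ACT_MIRROR_HOST,              /* copy to host CPU, e.g., for LI */
};

struct pipe_entry {
    struct pipe_match match;
    enum pipe_action  action;
};

/* First-match lookup over a pipe's entry table. */
static const struct pipe_entry *
pipe_lookup(const struct pipe_entry *tbl, size_t n,
            const struct pipe_match *pkt)
{
    for (size_t i = 0; i < n; i++) {
        const struct pipe_match *m = &tbl[i].match;
        if ((m->flags & M_TEID)   && m->teid   != pkt->teid)   continue;
        if ((m->flags & M_QFI)    && m->qfi    != pkt->qfi)    continue;
        if ((m->flags & M_DST_IP) && m->dst_ip != pkt->dst_ip) continue;
        if ((m->flags & M_DSCP)   && m->dscp   != pkt->dscp)   continue;
        return &tbl[i];           /* first matching entry wins */
    }
    return NULL;                  /* miss: send to slow path */
}
```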

dUPF-UP reference implementation with DOCA Flow

The dUPF-UP reference implementation accelerates AI traffic LBO through DOCA Flow, leveraging IP subnet-based Service Data Flow (SDF) classification and simplifying AI-DN deployment. Key simplifications include:

- Differentiating edge AI applications using IP subnet SDF
- Avoiding IP segmentation/reassembly by aligning MTUs
- Simplifying QoS and charging with Packet Detection Rule (PDR)-based assurance
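
As a minimal sketch of what IP subnet-based SDF classification means at runtime, consider the following C fragment: deciding whether a packet breaks out locally to the AI-DN reduces to a single masked compare on the destination address. The subnet value is a made-up example, not from the reference implementation.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* An AI-DN flow is identified purely by destination subnet, so local
 * breakout is one masked compare per packet. */

struct sdf_subnet {
    uint32_t net;   /* network address, host byte order */
    uint32_t mask;  /* e.g., 0xFFFFFF00 for a /24 */
};

static bool is_ai_dn(uint32_t dst_ip, const struct sdf_subnet *s)
{
    return (dst_ip & s->mask) == s->net;
}

int main(void)
{
    /* Example: AI-DN reachable at 10.10.8.0/24 (illustrative value). */
    struct sdf_subnet ai_dn = { .net = 0x0A0A0800, .mask = 0xFFFFFF00 };
    uint32_t pkt_dst = ntohl(inet_addr("10.10.8.42"));

    puts(is_ai_dn(pkt_dst, &ai_dn)
         ? "break out locally to AI-DN"
         : "forward to central UPF");
    return 0;
}
```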

dUPF-UP DOCA Flow pipelines are designed for N3 and N6 interfaces.

N3 interface DOCA Flow pipeline design

N3 interface uplink pipelines contain the following pipes, as shown in Figure 6:

- GTP Decap: Performs GTP header decapsulation
- Counter: Counts received packets for URR reporting
- Policer QoS flow MBR: Performs QER enforcement for the QoS flow-level MBR
- Policer QoS session MBR: Performs QER enforcement for the session-level MBR
- Counter: Counts packets post QER metering for URR reporting
- FAR (DSCP marking): Performs DSCP marking and other FAR handling
- Forward: Forwards the packet to the N6 interface

Figure 6. dUPF-UP N3 uplink DOCA Flow pipelines
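
Below is a plain-C sketch of the logic behind the GTP Decap pipe listed above. In the reference implementation this step runs as a BF3 hardware action via DOCA Flow rather than as host code; the parser follows the 3GPP TS 29.281 header layout, with extension-header handling omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define GTPU_MSG_GPDU 0xFF        /* G-PDU: carries user payload */

/* Parse a GTP-U header (3GPP TS 29.281) and return the offset of the
 * inner packet, or -1 for malformed or non-G-PDU input. */
static int gtpu_decap(const uint8_t *pkt, size_t len, uint32_t *teid)
{
    if (len < 8)
        return -1;                /* mandatory header is 8 bytes */
    if ((pkt[0] >> 5) != 1 || !(pkt[0] & 0x10))
        return -1;                /* require version 1, PT = GTP */
    if (pkt[1] != GTPU_MSG_GPDU)
        return -1;                /* signaling messages carry no data */

    *teid = (uint32_t)pkt[4] << 24 | (uint32_t)pkt[5] << 16 |
            (uint32_t)pkt[6] << 8  |  pkt[7];

    size_t off = 8;
    if (pkt[0] & 0x07) {          /* E/S/PN flag: 4 optional bytes */
        if (len < 12)
            return -1;
        off = 12;
        /* Walking extension headers (E flag) is omitted for brevity. */
    }
    return (int)off;              /* inner packet starts here */
}
```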

N6 interface DOCA Flow pipeline design

N6 interface downlink pipelines contain pipes as shown in Figure 7:

- GTP Decap: Performs GTP header decapsulation
- Counter: Counts received packets for URR reporting
- Policer QoS flow MBR: Performs QER enforcement for the QoS flow-level MBR
- Policer QoS session MBR: Performs QER enforcement for the session-level MBR
- Counter: Counts packets post QER metering for URR reporting
- GTP Encap: Performs GTP header encapsulation
- FAR (DSCP marking): Performs DSCP marking and other FAR handling
- Forward: Forwards the packet to the N3 interface

Figure 7. dUPF-UP N6 downlink DOCA Flow pipelines
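
To illustrate the QER semantics behind the two Policer pipes, here is a minimal token-bucket sketch in C. In the reference implementation this metering runs in BF3 hardware through DOCA Flow; this hypothetical host-side model only shows the conform/exceed decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token-bucket state for one MBR policer (flow or session). */
struct mbr_policer {
    uint64_t rate_bps;            /* configured Maximum Bit Rate */
    uint64_t burst_bytes;         /* bucket depth */
    uint64_t tokens;              /* current fill, in bytes */
    uint64_t last_ns;             /* timestamp of last refill */
};

/* Returns true if the packet conforms to the MBR; false means it
 * exceeds the rate and should be dropped or marked per the QER. */
static bool policer_conform(struct mbr_policer *p, uint64_t now_ns,
                            uint32_t pkt_bytes)
{
    uint64_t elapsed_ns = now_ns - p->last_ns;
    p->last_ns = now_ns;

    /* Refill: bits/s to bytes over elapsed_ns. (Overflow guarding for
     * very long idle gaps is omitted for brevity.) */
    p->tokens += (p->rate_bps / 8) * elapsed_ns / 1000000000ULL;
    if (p->tokens > p->burst_bytes)
        p->tokens = p->burst_bytes;

    if (p->tokens < pkt_bytes)
        return false;             /* exceeds MBR */
    p->tokens -= pkt_bytes;
    return true;                  /* conforms */
}
```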

To learn more about how to program the Counter, Policer, GTP Encap, GTP Decap, FAR, and Forward pipes, see the DOCA Flow Program Guide and the DOCA Flow Example Application Guide.

dUPF-UP example implementation lab validation

dUPF-UP was tested on a Supermicro 1U Grace Hopper MGX System server, using two dedicated CPU cores (core-0 and core-1). Core-0 managed control procedures for AI-DN session setup, while core-1 handled slow path exception packets via Poll Mode Driver (PMD) polling. The dUPF-CP simulator initiated 60,000 UE sessions at 1,000 sessions/second. After setup, user plane packets were sent over dual 100G links from a TRex traffic generator.

Observations include:

- Core-0 averaged under 7% CPU usage for control procedures
- Core-1 showed 100% CPU usage due to PMD polling, but no exception packets were delivered to it, as all user plane packets were handled by BF3
- The BF3 NIC hardware accelerated all user plane packets, achieving 100 Gbps throughput with zero packet loss

Lab performance testing summary

Based on the performance lab testing, the dUPF-UP example implementation on Grace plus BF3 achieved 100 Gbps throughput (the line rate of the test setup's 100G links) with zero packet loss. This demonstrates full hardware acceleration of user plane packet processing for AI traffic using an IP subnet SDF-based pipeline design, accomplished using only two Grace CPU cores. The functionality and performance achieved in lab testing validated the value propositions of dUPF-UP on the AI Aerial platform.

dUPF ecosystem adoption

Cisco embraces dUPF architecture, accelerated by the NVIDIA AI Aerial platform and the NVIDIA DOCA framework, as a cornerstone for 6G AI-centric networks. When combined with the AI-ready data center architecture, this enables telecom operators to deploy high-performance, energy-efficient dUPF with security infused and closely integrated AI inference extended to the network edge—opening the door to applications such as VSS, agentic AI, XR, and ultra-responsive AI-driven services.

“Software-defined DPU and GPU-accelerated edge infrastructure enable efficient deployment of Wireless RAN, Core, and AI applications, delivering superior user experiences and new monetization opportunities for service providers,” said Darin Kaufman, Head of Product, Cisco Mobility. “Together, Cisco and NVIDIA are building intelligent, secure, and energy-efficient edge networks that power the next generation of wireless connectivity.”  

Get started building and deploying AI-native networks

dUPF is a critical component for the 6G AI-centric network. By strategically deploying high-performance, ultra-low-latency, and energy-efficient dUPF accelerated on the NVIDIA AI Aerial platform with integrated AI inference at the network edge, operators can enable a new era of services. This dramatically lowers operational expenditures and ensures that the network infrastructure is agile and scalable enough to handle the immense demands of future AI-centric applications within a 6G network.

To get started, contact telco@nvidia.com to learn more about DOCA Flow hardware acceleration and the benefits of dUPF deployment on AI Aerial.
