NVIDIA Developer · September 3
CUDA-Q QEC 0.4: New Features to Accelerate Quantum Error Correction Experiments

NVIDIA's newly released CUDA-QX 0.4 brings several important updates to quantum error correction (QEC), aimed at accelerating researchers' QEC experiments. New features include automatic generation of detector error models (DEMs), which simplifies defining and simulating QEC codes; a tensor-network-based decoder that offers accurate, efficient decoding with performance on par with leading implementations; and several improvements to the BP+OSD decoder that add flexibility and monitoring capabilities, such as adaptive convergence monitoring, message clipping, and BP algorithm selection. The release also adds an implementation of the Generative Quantum Eigensolver (GQE), opening a new path for AI-driven quantum circuit design. Together, these updates help push quantum computing toward commercialization and large-scale application.

✨ **Automatic generation of detector error models (DEMs)**: CUDA-QX 0.4 adds the ability to automatically generate a DEM from a specified QEC circuit and noise model. The DEM is a key component of the QEC workflow: it describes the physical errors on qubits and how they show up in measurements, and it can be used both for simulated circuit sampling and for configuring decoders, avoiding data duplication and simplifying the definition and simulation of QEC codes.

🚀 **Tensor network decoder improves decoding accuracy and efficiency**: The new release introduces a Python-supported tensor network decoder. Built from the code's Tanner graph, it computes the exact probability that a logical observable has flipped. It is easy to understand, requires no training, and achieves high decoding accuracy, making it well suited as a research benchmark. By leveraging the GPU-accelerated cuQuantum libraries, the decoder reaches performance comparable to Google's equivalent implementation, giving researchers a standardized tool.

🔧 **BP+OSD decoder enhancements**: CUDA-QX 0.4 makes several improvements to the GPU-accelerated BP+OSD decoder, including adaptive convergence monitoring that lets users adjust the check interval to reduce overhead; message clipping, which sets a threshold to prevent numerical instability; BP algorithm selection between sum-product and min-sum; and a dynamic scaling factor that optimizes min-sum performance. Result monitoring has also been enhanced, allowing the evolution of log-likelihood ratios (LLRs) during BP decoding to be tracked.

💡 **Generative Quantum Eigensolver (GQE) enables AI-driven design**: The new release adds an implementation of the Generative Quantum Eigensolver (GQE) to the Solvers library. GQE is a hybrid algorithm that uses a generative AI model to search for eigenstates (especially ground states) of quantum Hamiltonians, with the potential to alleviate convergence issues, such as barren plateaus, that affect the traditional Variational Quantum Eigensolver (VQE). Its workflow of generating candidate quantum circuits, evaluating their performance, and updating the model offers a new approach to quantum algorithm design.

As quantum processing unit (QPU) builders and algorithm developers work to create large-scale, commercially viable quantum supercomputers, they are increasingly concentrating on quantum error correction (QEC), which represents both the greatest opportunity and the biggest challenge in current quantum computing research.

CUDA-Q QEC aims to speed up researchers' QEC experiments through the rapid creation of fully accelerated, end-to-end workflows: from defining and simulating novel codes with circuit-level noise models, to configuring realistic decoders and deploying them alongside physical QPUs. Every component in this workflow is meant to be user-definable through a comprehensive API, and we built out key parts of it in the CUDA-QX 0.4 release.

This blog walks you through the biggest new features. See the complete release notes on GitHub, where you can also keep track of ongoing development, provide feedback, and contribute.

Generating a detector error model (DEM) from a memory circuit

The first step in a QEC workflow is defining a QEC code with an associated noise model.

QEC codes are ultimately implemented through stabilizer measurements, which are themselves noisy quantum circuits. The effective decoding of many stabilizer rounds requires knowledge of these circuits, the mapping of each measurement to a stabilizer (detector), and a prior estimate of the probability of every physical error that can occur in each circuit. The detector error model (DEM), originally developed as part of Stim (Quantum, 2021) and described in the paper Designing fault-tolerant circuits using detector error models (arXiv, 2024), provides a useful way to describe this setup.
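
To make this concrete, the snippet below builds a toy DEM in Stim's text format, where each error line carries a prior probability, the detectors it flips, and any logical observables it affects; the probabilities and indices here are invented purely for illustration.

```python
import stim

# A toy DEM (values invented for illustration): each `error` line gives a
# prior probability, the detectors (D...) it flips, and the logical
# observables (L...) it affects.
dem = stim.DetectorErrorModel("""
    error(0.001) D0
    error(0.002) D0 D1
    error(0.0005) D1 L0
""")

print(dem.num_detectors, dem.num_observables)  # 2 detectors, 1 observable
```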

Figure 1. Diagram of the end-to-end CUDA-Q QEC workflow. The DEM is a single object used in both the simulation of circuit shots and the configuration of the decoder, avoiding duplication in these steps.

As of the CUDA-QX 0.4 release, you can automatically generate the DEM from a specified QEC circuit and noise model. The DEM can then be used for both circuit sampling in simulation and decoding the resulting syndromes using the standard CUDA-Q QEC decoder interface. For memory circuits, all necessary logic is already provided behind the CUDA-Q QEC API.
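
A minimal sketch of that flow is shown below, assuming the cudaq_qec Python package; the helper and attribute names used here (dem_from_memory_circuit, detector_error_matrix) follow the pattern of the 0.4 API but should be checked against the linked documentation.

```python
# Sketch only: helper and attribute names are approximations of the
# CUDA-Q QEC 0.4 Python API; see the linked API docs for exact signatures.
import cudaq
import cudaq_qec as qec

code = qec.get_code("surface_code", distance=3)        # define the QEC code

noise = cudaq.NoiseModel()                             # circuit-level noise
noise.add_all_qubit_channel("x", cudaq.DepolarizationChannel(0.001))

# One DEM object drives both simulated syndrome sampling and decoder setup.
dem = qec.dem_from_memory_circuit(code, qec.operation.prep0, 3, noise)
decoder = qec.get_decoder("single_error_lut", dem.detector_error_matrix)
```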

For more information on DEMs in CUDA-Q QEC, see the C++ API and Python API documentation and examples.

Tensor networks to enable exact maximum likelihood decoding

The use of tensor networks for QEC decoding offers several advantages in research. Relative to other algorithmic and AI decoders, tensor-network decoders are easy to understand: the tensor network for a code is based on its Tanner graph and can be contracted to compute the probability that a logical observable has flipped, given a syndrome. They are guaranteed to be accurate, or even exact, and don't require training (though they can benefit from it). And while they are often used as benchmarks in research, there has been no open-access, go-to Python implementation that researchers could use as a standard for tensor network decoding.

CUDA-QX 0.4 introduces a tensor network decoder with support for Python 3.11 onward. The decoder provides: 

- Flexibility: The only inputs required are a parity check matrix, a logical observable, and a noise model. This allows users to decode different codes with circuit-level noise.
- Accuracy: The tensor networks are contracted exactly, so the decoder achieves the theoretically optimal decoding accuracy (see Figure 2 below).
- Performance: By exploiting the GPU-accelerated cuQuantum libraries, users can push the performance of contractions and path optimizations beyond what was previously possible.
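
A rough usage sketch follows; the decoder name and keyword arguments are assumptions about how the factory exposes these three inputs, so the exact interface should be taken from the Python API documentation linked below.

```python
# Sketch only: the decoder name and keyword arguments are assumptions;
# check the CUDA-Q QEC Python API docs for the exact interface.
import numpy as np
import cudaq_qec as qec

H = np.array([[1, 1, 0],                          # parity check matrix of a
              [0, 1, 1]], dtype=np.uint8)         # toy 3-bit repetition code
logical = np.array([[1, 1, 1]], dtype=np.uint8)   # logical observable support
priors = [0.01, 0.01, 0.01]                       # per-error probabilities

decoder = qec.get_decoder("tensor_network_decoder", H,
                          logical_obs=logical, noise_model=priors)

result = decoder.decode([1, 0])   # probability that the logical observable
print(result)                     # flipped, given this syndrome
```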

In Figure 2 below, we plot the logical error rate (LER) of the CUDA-Q QEC tensor network decoder using exact contraction on the open source dataset from the paper Suppressing quantum errors by scaling a surface code logical qubit (Nature, 2023). All reference lines (Ref in the figure) quote data from the paper Learning high-accuracy error decoding for quantum processors (Nature, 2024). The CUDA-Q QEC decoder reaches LER parity with Google's tensor network decoder while providing an open-source, GPU-accelerated implementation.

For more information on tensor network decoding in CUDA-Q QEC see the Python API documentation and examples.

Improvements to the BP+OSD decoder

CUDA-QX 0.4 introduces several improvements to its GPU-accelerated Belief Propagation + Ordered Statistics Decoding (BP+OSD) implementation, which provide enhanced flexibility and monitoring capabilities:

Adaptive convergence monitoring

iter_per_check introduces a configurable BP convergence-checking interval. Set to one iteration by default, this parameter can be increased up to the user's maximum iteration limit to reduce overhead in scenarios where frequent convergence checks aren't necessary.

Message clipping for numerical stability

clip_value addresses potential numerical instabilities in BP by implementing message clipping. This feature allows users to set a non-negative threshold value to prevent message values from growing excessively large, which can lead to overflow or precision issues. When set to 0.0 (default), clipping is disabled, maintaining backward compatibility. Note that clipping aggressively could potentially impact BP’s performance.

BP algorithm selection

bp_method lets users choose between two BP algorithms: sum-product, the traditional approach that offers robust performance for most scenarios, and min-sum, a computationally efficient alternative that can converge faster in certain cases.

Dynamic scaling for min-sum optimization

scale_factor enhances the min-sum algorithm with adaptive scaling capabilities. Users can specify a fixed scale factor (defaults to 1.0) or enable dynamic computation by setting it to 0.0, where the factor is automatically determined based on iteration count.  

Result monitoring

opt_results with bp_llr_history introduces logging capabilities that allow researchers and developers to track the evolution of log-likelihood ratios (LLR) throughout BP’s decoding process. Users can configure the history depth from 0 to the maximum iteration count. 
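
Taken together, these options can be supplied when the decoder is constructed. The sketch below uses the option names introduced in this release, but the decoder name shown ("nv-qldpc-decoder") and the way options are passed are assumptions to be verified against the API documentation linked below.

```python
# Sketch only: option names come from this release; the decoder name and the
# exact option-passing style are assumptions (see the linked API docs).
import numpy as np
import cudaq_qec as qec

H = np.load("parity_check_matrix.npy")   # hypothetical path to a saved PCM

decoder = qec.get_decoder(
    "nv-qldpc-decoder", H,
    max_iterations=50,
    iter_per_check=5,        # check BP convergence every 5 iterations
    clip_value=20.0,         # clamp messages for numerical stability
    bp_method="min-sum",     # algorithm choice: sum-product or min-sum
    scale_factor=0.0,        # 0.0 requests dynamic scaling per iteration
    bp_llr_history=50,       # keep up to 50 iterations of LLRs in opt_results
)

result = decoder.decode(np.zeros(H.shape[0], dtype=np.uint8))
```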

For complete information on CUDA-Q QEC’s BP+OSD decoder see the latest Python API or C++ API documentation and a full example.

A Generative Quantum Eigensolver (GQE) for AI-driven quantum circuit design

CUDA-QX 0.4 adds an out-of-the-box implementation of the Generative Quantum Eigensolver (GQE) to the Solvers library. This algorithm is the subject of ongoing research, especially with regard to the loss function. The current example provides a cost function suitable for small-scale simulation.

GQE is a novel hybrid algorithm for finding eigenstates (especially ground states) of quantum Hamiltonians using generative AI models. In contrast to the Variational Quantum Eigensolver (VQE), where the quantum program has a fixed parameterization, GQE shifts all program design into a classical AI model. This has the potential to alleviate convergence issues with traditional VQE approaches, such as barren plateaus.

Our implementation uses a transformer model following The generative quantum eigensolver (GQE) and its application for ground state search (arXiv, 2024) and has been described in further detail in a previous NVIDIA technical blog, Advancing Quantum Algorithm Design with GPT.

The GQE algorithm performs the following steps:

1. Initialize or load a pre-trained generative model.
2. Generate candidate quantum circuits.
3. Evaluate circuit performance on the target Hamiltonian.
4. Update the generative model based on the results.
5. Repeat generation and optimization until convergence.
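
The loop below is a schematic of these steps rather than the Solvers API: a random sampler stands in for the generative model, and cudaq.observe scores each candidate circuit against a toy Hamiltonian. The real implementation replaces the sampler with the transformer described above.

```python
# Schematic of the GQE loop (not the Solvers API): random sampling stands in
# for the generative model that proposes operator sequences.
import random
import cudaq
from cudaq import spin

hamiltonian = spin.z(0) * spin.z(1) + 0.5 * spin.x(0)   # toy 2-qubit Hamiltonian

@cudaq.kernel
def candidate(ops: list[int], angles: list[float]):
    q = cudaq.qvector(2)
    for i in range(len(ops)):                # apply the proposed operator sequence
        if ops[i] == 0:
            rx(angles[i], q[0])
        elif ops[i] == 1:
            ry(angles[i], q[0])
        elif ops[i] == 2:
            rx(angles[i], q[1])
        else:
            ry(angles[i], q[1])

best = None
for step in range(10):                                     # repeat until "converged"
    ops = [random.randrange(4) for _ in range(4)]          # generate a candidate circuit
    angles = [random.uniform(-3.14, 3.14) for _ in range(4)]
    energy = cudaq.observe(candidate, hamiltonian, ops, angles).expectation()
    if best is None or energy < best:                      # stand-in for a model update
        best = energy

print("Best energy found:", best)
```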

For complete details on the Solvers implementation of GQE, see the Python API documentation and examples.

Conclusion

The CUDA-QX 0.4 release includes a variety of new features in both the Solvers and QEC libraries, including a new Generative Quantum Eigensolver (GQE) implementation, a new tensor network decoder, and a new API for auto-generating detector error models from noisy CUDA-Q memory circuits.

See the GitHub repository and the documentation for all the details.
