NVIDIA Developer, September 3
NVIDIA FourCastNet3: A New Breakthrough in AI Weather Forecasting

 

NVIDIA has released FourCastNet3 (FCN3), its latest AI global weather forecasting system, which delivers significant improvements in probabilistic accuracy, computational efficiency, spectral fidelity, ensemble calibration, and subseasonal stability. FCN3's medium-range forecast accuracy matches leading machine learning models such as GenCast and surpasses traditional numerical weather prediction systems such as IFS-ENS. On an NVIDIA H100 GPU, FCN3 completes a 60-day forecast in under four minutes, 8x faster than GenCast and 60x faster than IFS-ENS. The system also exhibits excellent calibration and spectral fidelity: ensemble members retain realistic spectral properties even over 60-day lead times. FCN3 uses a convolutional neural network architecture built on spherical signal processing and introduces a hidden Markov model driven by a diffusion process to generate ensemble members efficiently in a single step. Through a novel model-parallelization strategy, FCN3 scales training to 1,024 GPUs, further improving its performance and predictive capability. FCN3 represents a major leap for data-driven, large-ensemble weather forecasting from medium-range to subseasonal timescales.

🚀 **Outstanding forecast skill and efficiency**: FourCastNet3 (FCN3) matches advanced machine learning models such as GenCast in forecast accuracy and surpasses traditional numerical models such as IFS-ENS. It is extremely efficient: on an NVIDIA H100 GPU, a full 60-day forecast takes under 4 minutes, 8x faster than GenCast and 60x faster than IFS-ENS, greatly accelerating the forecasting workflow.

💡 **Novel architecture and stochasticity**: FCN3 adopts a fully convolutional spherical neural operator architecture that combines local spherical convolutions with spectral convolutions, parameterized by Morlet wavelets. It introduces stochasticity through a hidden Markov model and a diffusion process, generating ensemble members via a latent noise variable at every prediction step, which is more efficient than diffusion-model-based approaches.

🌐 **Strong scalability and large-scale training**: To scale the model, FCN3 introduces a new model-parallelization paradigm inspired by domain decomposition in traditional numerical weather modeling. The approach splits the model across multiple devices, lowering per-device disk I/O, and supports parallel training on up to 1,024 GPUs, significantly improving its ability to handle complex weather patterns.

📊 **High-fidelity ensemble forecasts**: FCN3's ensembles show excellent calibration and spectral fidelity; even at lead times of up to 60 days, ensemble members retain realistic spectral properties, faithfully reproducing the energy cascade and the sharpness of weather patterns. This ensures reliable, physically consistent predictions that stay accurate and sharp even in long-range forecasts.

FourCastNet3 (FCN3) is the latest AI global weather forecasting system from NVIDIA Earth-2. FCN3 offers an unprecedented combination of probabilistic skill, computational efficiency, spectral fidelity, ensemble calibration, and stability at subseasonal timescales. Its medium-range forecasting accuracy matches that of leading machine learning models, such as GenCast, and exceeds that of traditional numerical weather prediction systems, such as IFS-ENS. 

A single 60-day FCN3 rollout with 0.25° and 6-hourly resolution is computed in under four minutes on a single NVIDIA H100 Tensor Core GPU—an 8x speedup over GenCast and a 60x speedup over IFS-ENS. 

It also has remarkable calibration and spectral fidelity, with ensemble members retaining realistic spectral properties even at extended lead times of 60 days. FCN3 demonstrates a significant leap towards data-driven weather prediction with large ensembles from medium-range to subseasonal timescales.

Figure 1. 2-week rollout of 15 FourCastNet3 ensemble members, displaying surface wind speeds during this period

FCN3 architecture 

FourCastNet3 employs a fully convolutional, spherical neural operator architecture,  based on spherical signal processing primitives (see Figure 2). Unlike FourCastNet2, which is based on the Spherical Fourier Neural Operator, FCN3 uses local spherical convolutions alongside spectral convolutions. 

These convolutions are parameterized using Morlet wavelets and formulated in the framework of discrete-continuous group convolutions. This approach enables anisotropic, localized filters well-suited to localized atmospheric phenomena, while also guaranteeing computational efficiency through a custom implementation in NVIDIA CUDA.
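The interplay of the two convolution types is easiest to see in one dimension, where the convolution theorem makes the spectral and local views of the same operation explicit. The sketch below is a simplified 1D analogy, not FCN3's implementation: it uses an FFT in place of spherical transforms and random weights in place of Morlet wavelet parameterization.

```python
import numpy as np

# Simplified 1D analogy of a spectral convolution. The spherical case
# replaces the FFT with spherical transforms and parameterizes the
# filters with Morlet wavelets; everything here is illustrative only.

rng = np.random.default_rng(0)
n = 64
signal = rng.standard_normal(n)

# "Learned" spectral weights: one complex coefficient per frequency.
weights = rng.standard_normal(n // 2 + 1) + 1j * rng.standard_normal(n // 2 + 1)
# Keep the DC and Nyquist bins real so the kernel is exactly real-valued.
weights[0], weights[-1] = weights[0].real, weights[-1].real

# Spectral view: transform, multiply per mode, transform back.
spec = np.fft.rfft(signal)
out_spectral = np.fft.irfft(weights * spec, n=n)

# Local view: the same operation is a circular convolution with the
# kernel whose spectrum is `weights` (convolution theorem).
kernel = np.fft.irfft(weights, n=n)
out_local = np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))

assert np.allclose(out_spectral, out_local)
```

A spectral filter acts globally with one coefficient per mode, while a local filter has compact support; FCN3's combination of both lets it capture planetary-scale structure and localized phenomena in the same layer.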

FCN3 introduces stochasticity at every predictive step through a latent noise variable whose evolution is governed by a diffusion process on the sphere. This hidden-Markov formulation enables efficient one-step generation of ensemble members, a key advantage over diffusion model-based approaches. FCN3 is trained jointly as an ensemble, minimizing a composite loss function that combines the continuous ranked probability score (CRPS) in space and in the spectral domain. This approach ensures that FCN3 learns the correct spatial correlations in the underlying stochastic atmospheric processes.
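The spatial term of such a loss can be illustrated with the standard empirical CRPS estimator for a finite ensemble, CRPS = E|X − y| − ½ E|X − X′|, which rewards ensembles that are both close to the observation and appropriately spread. The snippet below is a generic textbook sketch, not FCN3's actual training loss.

```python
import numpy as np

def ensemble_crps(members: np.ndarray, obs: float) -> float:
    """Empirical CRPS of an ensemble against a scalar observation.

    Standard estimator: E|X - y| - 0.5 * E|X - X'|. Lower is better.
    Generic textbook sketch, not FCN3's training implementation.
    """
    members = np.asarray(members, dtype=float)
    skill = np.mean(np.abs(members - obs))
    spread = np.mean(np.abs(members[:, None] - members[None, :]))
    return skill - 0.5 * spread

# A sharp, well-centered ensemble scores lower (better) than a
# poorly centered, overly broad one.
obs = 1.0
sharp = np.array([0.9, 1.0, 1.1, 1.05])
broad = np.array([-1.0, 0.0, 2.0, 3.0])
assert ensemble_crps(sharp, obs) < ensemble_crps(broad, obs)
```

Because the score is minimized jointly over all members, the gradient pushes the ensemble as a whole toward the right distribution rather than collapsing each member onto the mean.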

Scaling ML models is often crucial to achieving competitive skill, but the effects of scale have not been systematically investigated for data-driven weather models. FCN3 is unusual in its computational ambition. To scale it, we introduce a novel paradigm for model parallelism inspired by domain decomposition in traditional numerical weather modeling.

Figure 3. Probabilistic scores of FCN3 computed on 12-hourly initial conditions throughout the out-of-sample validation year 2020. From top to bottom: continuous ranked probability score (CRPS), ensemble-mean root mean square error (RMSE), spread-skill ratio (SSR), and rank histograms are displayed

This approach enables us to fit larger models into VRAM during training by splitting the model across multiple devices, while lowering the disk I/O per device. To enable this, spatial operations such as convolutions are implemented in a distributed fashion using the NVIDIA Collective Communications Library (NCCL). Using this technology, FCN3 is trained on up to 1,024 GPUs, using simultaneous domain, batch, and ensemble parallelism. Check out our training code.
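The core idea of domain decomposition can be sketched without GPUs: each worker holds a slab of the grid plus a thin "halo" copied from its neighbors, which is all a local stencil or convolution needs to reproduce the undecomposed result. In FCN3 the halo exchange runs over NCCL between devices; in this toy version the ranks are simulated as array chunks within a single process.

```python
import numpy as np

# Toy illustration of domain decomposition with halo exchange.
# "Ranks" are simulated as chunks of one array in a single process;
# the real system exchanges halos between GPUs via NCCL.

def stencil(x: np.ndarray) -> np.ndarray:
    # Simple 3-point moving average with periodic boundary.
    return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

rng = np.random.default_rng(0)
field = rng.standard_normal(16)
reference = stencil(field)  # undecomposed (single-device) result

nranks = 4
chunks = np.split(field, nranks)

# Each rank pads its chunk with one halo point from each neighbor,
# applies the stencil locally, then keeps only its interior points.
local_results = []
for r in range(nranks):
    left = chunks[(r - 1) % nranks][-1:]   # halo from left neighbor
    right = chunks[(r + 1) % nranks][:1]   # halo from right neighbor
    padded = np.concatenate([left, chunks[r], right])
    smoothed = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
    local_results.append(smoothed)

# The distributed computation matches the global one exactly.
assert np.allclose(np.concatenate(local_results), reference)
```

The halo width scales with the stencil's footprint, not the domain size, which is why this decomposition keeps communication cheap relative to compute as the grid grows.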

FourCastNet3 outperforms the best physics-based ensemble model, IFS-ENS, and matches GenCast in terms of predictive skill (see Figure 3). On a single NVIDIA H100, FCN3 produces a single 15-day forecast at 6-hourly temporal resolution and 0.25° spatial resolution in a minute, an 8x speedup over GenCast and a 60x speedup over IFS-ENS. 

Its probabilistic ensembles exhibit spread-skill ratios consistently near one, indicating well-calibrated forecasts where the predicted uncertainty aligns closely with observed atmospheric variability. Rank histograms and additional diagnostics confirm that ensemble members remain interchangeable with real-world observations, affirming the reliability and trustworthiness of FCN3’s predictions. 
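The spread-skill ratio compares the ensemble's internal spread with the RMSE of its mean; for a statistically consistent M-member ensemble it converges to √(M/(M+1)) ≈ 1. A minimal synthetic sketch (illustrative shapes and names, not FCN3's evaluation code):

```python
import numpy as np

# Spread-skill ratio (SSR) on a synthetic, statistically consistent
# ensemble: truth and members scatter identically around a shared
# predictable state, so SSR should land near 1.

rng = np.random.default_rng(0)
n_members, n_points = 50, 10_000

hidden = rng.standard_normal(n_points)               # predictable part
truth = hidden + rng.standard_normal(n_points)       # truth = state + noise
members = hidden[None, :] + rng.standard_normal((n_members, n_points))

ens_mean = members.mean(axis=0)
rmse = np.sqrt(np.mean((ens_mean - truth) ** 2))
spread = np.sqrt(np.mean(members.var(axis=0, ddof=1)))

# For a consistent M-member ensemble, SSR -> sqrt(M / (M + 1)) ~ 1.
ssr = spread / rmse
assert 0.9 < ssr < 1.1
```

SSR well below one signals overconfidence (spread too narrow for the actual errors); well above one signals an underconfident, washed-out ensemble.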

Critically, FCN3 preserves atmospheric spectral signatures across all scales, faithfully reproducing the energy cascade and sharpness of real-world weather patterns even at extended lead times of up to 60 days. Unlike many ML models that blur high-frequency features or devolve into noisy artifacts over time, FCN3 maintains stable, physically realistic spectra—enabling accurate, sharp, and physically consistent forecasts well into the subseasonal range. 
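The blurring failure mode is easy to see in a power spectrum: smoothing a field drains variance from the high wavenumbers. Below is a 1D toy version of that diagnostic; FCN3's actual evaluation uses angular power spectral densities on the sphere, and the moving-average "forecast" here is purely illustrative.

```python
import numpy as np

# 1D toy spectral-fidelity check: a blurred "forecast" loses power
# at high wavenumbers relative to the true field. Illustrative only;
# the real diagnostic uses angular power spectra on the sphere.

rng = np.random.default_rng(0)
n = 512
field = rng.standard_normal(n)

# A blurred "forecast": 5-point periodic moving average.
kernel = np.zeros(n)
kernel[:5] = 1 / 5
blurred = np.real(np.fft.ifft(np.fft.fft(field) * np.fft.fft(kernel)))

def power_spectrum(x: np.ndarray) -> np.ndarray:
    return np.abs(np.fft.rfft(x)) ** 2

p_true, p_blur = power_spectrum(field), power_spectrum(blurred)

# The high-frequency half of the spectrum loses most of its power.
hi = slice(n // 4, None)
assert p_blur[hi].sum() < 0.2 * p_true[hi].sum()
```

A model with good spectral fidelity keeps the forecast spectrum close to the true one across all wavenumbers, rather than flattening the high-frequency tail as the lead time grows.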

This is shown in Figure 4, which depicts FCN3 predictions of 500 hPa wind intensities initialized on February 11, 2020, shortly before Storm Dennis made landfall over Europe. FCN3 accurately captures the magnitude of the wind intensities and their variability across length scales, as illustrated by the faithful angular power spectral density of the respective predictions. This holds even at extended rollouts of 30 days (720 hours) or longer.

Figure 4. FourCastNet3 prediction of Storm Dennis, initialized on 2020-02-11 at 00:00:00 UTC. The plot depicts wind speeds at a pressure level of 850 hPa and isohypses (height contours) of the 500 hPa geopotential height.

Getting started with FourCastNet3

The fully trained FourCastNet3 checkpoint is available on NVIDIA NGC.

An easy way to run FCN3 inference is using Earth2Studio. To run a single 4-member ensemble inference, you can execute the following code:

from earth2studio.models.px import FCN3
from earth2studio.data import NCAR_ERA5
from earth2studio.io import NetCDF4Backend
from earth2studio.perturbation import Zero
from earth2studio.run import ensemble as run
import numpy as np

# load default package
model = FCN3.load_model(FCN3.load_default_package())

# determine output variables
out_vars = ["u10m", "v10m", "t2m", "msl", "tcwv"]

# data source initial condition
ds = NCAR_ERA5()

io = NetCDF4Backend("fcn3_ensemble.nc", backend_kwargs={"mode": "w"})

# no perturbation required due to hidden Markov formulation of FCN3
perturbation = Zero()

# invoke inference with 4 ensemble members
run(
    time=["2024-09-24"],
    nsteps=16,
    nensemble=4,
    prognostic=model,
    data=ds,
    io=io,
    perturbation=perturbation,
    batch_size=1,
    output_coords={"variable": np.array(out_vars)},
)

Results from this inference are depicted in Figure 5. For optimal FCN3 performance, we recommend installing torch-harmonics with custom CUDA extensions enabled and using automatic mixed precision in bf16 format during inference (the default in Earth2Studio). If you want to run custom FCN3 inference or train the model yourself, you can find the code in makani.

Figure 5. FourCastNet3 predictions at 96 h lead time generated with the Earth2Studio script. The run was initialized on 2024-09-24 at 00:00:00 UTC. The top row depicts the tcwv (total column water vapor) field and the u10m (10-meter zonal wind velocity) field of ensemble member 2, respectively. The bottom row shows the standard deviation of both fields taken over all four ensemble members.

Learn more about FCN3

Learn more about FourCastNet3 with these resources:

Full author list

Boris Bonev (NVIDIA), Thorsten Kurth (NVIDIA), Ankur Mahesh (LBNL), Mauro Bisson (NVIDIA), Karthik Kashinath (NVIDIA), Anima Anandkumar (Caltech), William D. Collins (LBNL), Mike Pritchard (NVIDIA), Alex Keller (NVIDIA)
