OpenFold3: A New Milestone in AI-Driven Protein Structure Prediction

For decades, one of biology’s deepest mysteries was how a string of amino acids folds itself into the intricate architecture of life. Researchers built painstaking simulations and statistical models, inching toward an answer but never crossing the threshold of prediction at scale.

Then, deep learning changed everything. By learning the language of evolution directly from sequence data, AI began to uncover the hidden rules of molecular form, transforming structure prediction from an art into an engineering discipline.

Today, that transformation reaches a new milestone. OpenFold3 brings production-ready protein AI into the NVIDIA ecosystem, uniting open science with enterprise-grade performance. Developed by the OpenFold Consortium and accelerated by NVIDIA, OpenFold3 extends structure prediction beyond single proteins to model multi-chain complexes, nucleic acids, and small-molecule ligands—the complete grammar of biological interaction.

With NVIDIA cuEquivariance for symmetry-aware GPU acceleration, compatibility with MMseqs2-GPU for rapid sequence search, and NVIDIA FLARE for federated training, OpenFold3 delivers unprecedented speed, scale, and privacy-preserving collaboration for biopharma and biotech teams worldwide.

OpenFold3 is now available, including as an NVIDIA NIM with additional acceleration. This post walks through how to use the OpenFold3 NIM for structure prediction.

Quick links to get started

Prerequisites

Structure prediction with the OpenFold3 NIM

With OpenFold3 NIM, structure prediction can move from prototype to production in just a few steps, as detailed below.

Step 1: Access the model

OpenFold3 NIM is available through build.nvidia.com. You can deploy the container locally, on a cluster, or as a managed NIM service.

docker pull nvcr.io/nim/openfold/openfold3:latest
export LOCAL_NIM_CACHE=~/.cache/nim
export NGC_API_KEY=<Your NGC API Key>
docker run --rm --name openfold3 \
    --runtime=nvidia \
    --gpus 'device=0' \
    -p 8000:8000 \
    -e NGC_API_KEY \
    -v $LOCAL_NIM_CACHE:/opt/nim/.cache \
    --shm-size=16g \
    nvcr.io/nim/openfold/openfold3:latest
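The container can take a while to download weights and start serving, so it helps to wait for it before sending work. The following is a minimal sketch of a readiness poll. The helper names are hypothetical, and the `/v1/health/ready` route mentioned in the usage note is an assumption based on the convention used by other NIM containers; confirm the route in the OpenFold3 NIM documentation for your release.

```python
import time
import urllib.error
import urllib.request


def probe_ready(url, timeout=5.0):
    """Return True if a GET on `url` answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(url, probe=probe_ready, attempts=60, delay=10.0, sleep=time.sleep):
    """Poll `url` until the probe succeeds or `attempts` probes have failed.

    `probe` and `sleep` are injectable so the loop can be tested offline.
    """
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A call such as `wait_until_ready("http://localhost:8000/v1/health/ready")` would then block until the service answers or the attempts run out, under the assumption about the route stated above.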

Step 2: Submit a structure prediction job

Once deployed, you can interact with the API using standard REST calls or Python clients:

#!/usr/bin/env python3
import requests
import os
import json
from pathlib import Path

# Define output file and inference endpoint
output_file = "output.json"
url = "http://localhost:8000/biology/openfold/openfold3/predict"

# Define protein sequence
protein_sequence = "MGREEPLNHVEAERQRREKLNQRFYALRAVVPNVSKMDKASLLGDAIAYINELKSKVVKTESEKLQIKNQLEEVKLELAGRLEHHHHHH"

# Define MSA alignment in CSV format
msa_alignment_csv = "key,sequence\n-1,MGREEPLNHVEAERQRREKLNQRFYALRAVVPNVSKMDKASLLGDAIAYINELKSKVVKTESEKLQIKNQLEEVKLELAGRLEHHHHHH"

# Define DNA sequences (complementary pair)
dna_sequence_b = "AGGAACACGTGACCC"
dna_sequence_c = "TGGGTCACGTGTTCC"

# Build request data
data = {
    "request_id": "5GNJ",
    "inputs": [
        {
            "input_id": "5GNJ",
            "molecules": [
                {
                    "type": "protein",
                    "id": "A",
                    "sequence": protein_sequence,
                    "msa": {
                        "main_db": {
                            "csv": {
                                "alignment": msa_alignment_csv,
                                "format": "csv",
                            }
                        }
                    }
                },
                {
                    "type": "dna",
                    "id": "B",
                    "sequence": dna_sequence_b
                },
                {
                    "type": "dna",
                    "id": "C",
                    "sequence": dna_sequence_c
                }
            ],
            "output_format": "pdb"
        }
    ]
}

r = requests.post(url=url, json=data)

# Save the json output
print(r, "Saving to output.json:\n", r.text[:200], "...")
Path(output_file).write_text(r.text)
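When submitting more than one complex, it can be easier to build the request body with small helpers than to write the nested dictionary by hand. The sketch below reproduces the field names from the request above; the function names are hypothetical, only the `protein` and `dna` molecule types appear in the example, and the duplicate-ID check is a local sanity check rather than a documented server rule.

```python
def single_sequence_msa(sequence):
    """CSV alignment containing only the query, in the layout used by the example."""
    return f"key,sequence\n-1,{sequence}"


def make_molecule(mol_type, chain_id, sequence, msa_csv=None):
    """Build one entry of the "molecules" list."""
    molecule = {"type": mol_type, "id": chain_id, "sequence": sequence}
    if msa_csv is not None:
        # Same nesting as the protein entry in the example request.
        molecule["msa"] = {"main_db": {"csv": {"alignment": msa_csv, "format": "csv"}}}
    return molecule


def build_request(request_id, molecules, output_format="pdb"):
    """Wrap the molecules in the request envelope used by the example."""
    ids = [m["id"] for m in molecules]
    if len(ids) != len(set(ids)):
        # Assumption: chain IDs should be unique within one input.
        raise ValueError(f"duplicate chain ids: {ids}")
    return {
        "request_id": request_id,
        "inputs": [
            {
                "input_id": request_id,
                "molecules": molecules,
                "output_format": output_format,
            }
        ],
    }
```

The resulting dictionary can be passed as `json=` to `requests.post`, exactly as `data` is in the script above.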

Predictions include 3D coordinates (PDB/mmCIF) and confidence metrics such as pLDDT, pTM, and ipTM, all delivered in seconds on NVIDIA H100 Tensor Core GPUs.
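As a quick sanity check on a returned structure, per-chain confidence can be summarized from the PDB text. The sketch below rests on two assumptions that this post does not confirm: that you have already extracted the PDB string from the JSON response (the response schema is not shown here), and that per-atom pLDDT is written to the B-factor field (columns 61 to 66 of ATOM/HETATM records), as is common for AlphaFold-style outputs. The function names are hypothetical.

```python
from collections import defaultdict


def bfactors_by_chain(pdb_text):
    """Collect B-factor values per chain ID from ATOM/HETATM records."""
    values = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            chain_id = line[21]            # column 22: chain identifier
            b_factor = float(line[60:66])  # columns 61-66: temperature factor
            values[chain_id].append(b_factor)
    return dict(values)


def mean_by_chain(pdb_text):
    """Average the B-factor column per chain (assumed here to hold pLDDT)."""
    return {
        chain: sum(vals) / len(vals)
        for chain, vals in bfactors_by_chain(pdb_text).items()
    }
```

If the NIM reports pLDDT elsewhere in the response instead, prefer that value; the B-factor convention is only a fallback assumption.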

A new open standard for protein structure prediction

The OpenFold Consortium, an industry-led coalition including Bayer, Bristol Myers Squibb, Johnson & Johnson, Novo Nordisk, Outpace Bio, and others, has been instrumental in advancing open, reproducible modeling systems.

OpenFold3 represents the consortium’s most significant milestone yet. The model extends structure prediction to multimers, protein–DNA/RNA complexes, and ligand-bound assemblies, achieving accuracy that meets or exceeds leading open-source models.

Notably, OpenFold3 reaches parity with AlphaFold3 performance on protein–nucleic acid benchmarks, an area where earlier models have traditionally lagged. It is also classified as a Class 1 open-source system under the Linux Foundation open model definitions, ensuring full transparency and reproducibility.

Open science meets enterprise reliability

OpenFold3 is optimized for the NVIDIA accelerated AI computing stack, including:

cuEquivariance, for symmetry-aware GPU acceleration

MMseqs2-GPU, for rapid sequence search

NVIDIA FLARE, for federated training

Together, these integrations make OpenFold3 NIM both developer-accessible and enterprise-deployable—a drop-in service for on-prem, hybrid, and cloud environments. NVIDIA TensorRT enables up to 1.8x faster inference for large multimers and nucleic acid complexes.

OpenFold3 has been validated in secure federated workflows by Apheris and SandboxAQ, proving its ability to scale across global pharma R&D environments. Federated pipelines enable partners to fine-tune on proprietary data, such as antibody–antigen complexes or RNA–ligand assemblies, without moving datasets across institutional boundaries.

And because OpenFold3 is a Class 1 open system according to the Linux Foundation open model definitions, the software and consortium benefit from a rapidly growing ecosystem of contributors and benchmarks. This ensures continuous improvement and long-term reliability.

With NVIDIA FLARE integration, organizations can train OpenFold3 collaboratively across multiple sites, such as pharma partners, research consortia, and hospitals, without sharing sensitive data.

This approach supports regulatory compliance (GDPR and HIPAA, for example) while unlocking improvements to models from diverse datasets that would otherwise remain siloed.
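To make the idea concrete, here is a minimal sketch of federated averaging (FedAvg), a widely used aggregation scheme in which each site trains locally and shares only parameter values, weighted by how many examples it trained on. This post does not state which aggregation algorithm OpenFold3 training uses, and the code below does not call the NVIDIA FLARE API; the function name and data layout are hypothetical and chosen only for illustration.

```python
def federated_average(site_updates):
    """Weighted average of per-site parameters.

    site_updates: list of (num_examples, {param_name: [float, ...]}) tuples,
    one per site. Only these parameter values leave a site, never raw data.
    """
    total = sum(n for n, _ in site_updates)
    if total <= 0:
        raise ValueError("need at least one training example across sites")

    first_params = site_updates[0][1]
    averaged = {}
    for name, values in first_params.items():
        averaged[name] = [
            # Each site's contribution is scaled by its share of the examples.
            sum(n * params[name][i] for n, params in site_updates) / total
            for i in range(len(values))
        ]
    return averaged
```

For example, a site with 3 examples pulls the average three times as hard as a site with 1 example, so the combined model reflects dataset sizes without any site seeing another site's structures.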

Building the future of open protein AI

OpenFold3 is more than a model. It’s a foundation for the next decade of protein AI. It brings together the more than 40 institutions of the OpenFold Consortium with open-source science, accelerated computing, and federated collaboration, so that the tools used by researchers worldwide can also meet enterprise reliability and security standards.

Acknowledgments

Special thanks to the OpenFold Consortium and partners, including SandboxAQ and Apheris, for their collaboration in advancing open, accelerated AI for molecular science.
