NVIDIA Parabricks v4.6: Accelerated Genomic Analysis with New DeepVariant and STAR Features

Built for data scientists and bioinformaticians, NVIDIA Parabricks is a scalable genomics software suite for secondary analysis. It provides GPU-accelerated versions of open-source tools for increased speed and accuracy, so researchers can uncover biological insights faster.

The latest release, Parabricks v4.6, offers improvements to multiple features, most notably support for Google’s DeepVariant and DeepSomatic 1.9. This includes a pangenome-aware mode for DeepVariant, which improves accuracy across genetic variations and diverse populations.

New features:

- DeepVariant and DeepSomatic 1.9, including pangenome-aware DeepVariant.
- DeepSomatic long read and whole exome sequencing (WES) support.
- STAR quantMode including GeneCounts.

Improved features:

- STAR speedups: Almost 8x faster on two NVIDIA RTX PRO 6000 GPUs compared to CPU-only solutions.
- Additional arguments for Mutectcaller, including mitochondrial mode.

Improve variant calling with DeepVariant and DeepSomatic 1.9

Variant calling is a critical step in genomic analysis. It identifies differences between the sample genome (i.e., an individual or population) and a reference genome. Understanding these genetic differences gives scientists a better understanding of diseases and potential treatments.

There is a wide variety of tools built to address variant calling, including HaplotypeCaller and Mutect2 in the Genomic Analysis Toolkit (GATK) from the Broad Institute. In addition to the industry standards from GATK, deep-learning-based variant callers have become widely used.

Developed by Google, DeepVariant and DeepSomatic use deep learning to support variant identification. For germline data, DeepVariant identifies inherited variants. DeepSomatic, by contrast, identifies somatic variants: non-inherited mutations, including those found in tumor cells.
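To make the output of variant calling concrete, here is a minimal sketch that tallies the records in a VCF file, the standard text format that variant callers write. It relies only on the VCF column layout (REF in column 4, ALT in column 5, header lines starting with `#`). The function name `count_variants` is hypothetical and is not part of Parabricks or DeepVariant.

```shell
# Hypothetical helper: count SNVs versus indels/other alleles in a VCF.
# An allele is counted as an SNV when both REF and that ALT allele are
# a single base; every other non-missing allele is counted as indel/other.
count_variants() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        {
            n = split($5, alts, ",")           # ALT may list several alleles
            for (i = 1; i <= n; i++) {
                if (alts[i] == ".") continue   # no alternate allele called
                if (length($4) == 1 && length(alts[i]) == 1) snv++
                else indel++
            }
        }
        END { printf "SNVs: %d\nindels/other: %d\n", snv, indel }
    ' "$1"
}

# Example usage with a placeholder file name:
# count_variants sample.vcf
```

Symbolic alleles such as those in gVCF output fall into the "indels/other" bucket in this sketch, so treat the second number as a rough count.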

Enhancing variant calling accuracy is critical, particularly when considering genetic diversity. According to a recent paper, pangenome-aware DeepVariant reduced errors by up to 25.5% across all settings when compared to linear-reference-based DeepVariant.

“Taking genetic diversity into account is critical to accurate genome analysis, especially across diverse populations. New pangenome methods allow more comprehensive maps of genetic variation to inform analysis,” says Andrew Carroll, product lead at Google Research. “I’m excited by Parabricks v4.6 support for pangenome-aware DeepVariant v1.9, which combines the incredible speed of Parabricks with the new DeepVariant ability to directly use pangenome information during variant calling.”

Improve accuracy even more with Giraffe and DeepVariant v1.9

Traditional linear references, including the Genome Reference Consortium Human Build 38 (GRCh38), are built from the DNA of only a few individuals, providing a universal coordinate system for genomic research. However, these references don’t capture the full spectrum of genetic variation present across the broader human population. As a result, important subpopulation diversity is often underrepresented. This can introduce bias into subsequent analyses, such as read mapping and variant detection, which may miss or inaccurately interpret important genetic differences tied to ancestry or disease.

Unlike linear references, pangenomes are built by integrating multiple high-quality genomes from diverse individuals, capturing a much broader range of genetic variation present in human populations. This comprehensive approach reduces reference bias, improves variant detection across populations, and supports more accurate and equitable genomic analyses. Giraffe, a software tool developed by researchers at the University of California, Santa Cruz, enables efficient read alignment to pangenome graphs.

Giraffe maps genomic sequences to a reference pangenome rather than a traditional linear reference, improving variant-calling accuracy across diverse populations. Combining Giraffe with pangenome-aware mode in DeepVariant, which is now available in Parabricks v4.6, improves the accuracy of identified variants and provides the speed of Parabricks GPU acceleration.

[Figure 1 panels: Accuracy; Speed. Chart labels: Pangenome-aware DeepVariant, BWA]

*Figure 1. Using four NVIDIA RTX PRO 6000 GPUs, the total runtime for pangenome-aware DeepVariant 1.9 and Giraffe was reduced from more than 9 hours on a CPU-only solution to under 40 minutes*
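As a quick check on what these runtimes imply, the sketch below computes the ratio of CPU to GPU runtime. The helper name `speedup` is hypothetical. Because the caption gives bounds (more than 9 hours, under 40 minutes), the result is a lower bound on the speedup rather than a measured value.

```shell
# Hypothetical helper: ratio of CPU runtime to GPU runtime, both in minutes.
speedup() {
    awk -v cpu="$1" -v gpu="$2" 'BEGIN { printf "%.1f\n", cpu / gpu }'
}

# Figure 1 bounds: 9 hours (540 minutes) on CPU versus 40 minutes on GPU.
speedup 540 40   # prints 13.5
```

In other words, the Figure 1 numbers correspond to a speedup of more than 13.5x.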

“Roche’s SBX technology enables sequencing at unparalleled data rates and flexible data processing workflows for different sequencing applications,” says John Mannion, VP Computational Sciences at Roche. “Through our collaboration with NVIDIA, we plan to leverage GPU-accelerated versions of multiple aligners, including Giraffe, to provide users with an integrated solution allowing for faster and more accurate analysis.”

Get started with Giraffe and DeepVariant

Existing users of Parabricks can run DeepVariant after providing:

- the appropriate FASTA reference file from the Giraffe index files,
- a BAM file, and
- the graph GBZ file output from running Giraffe.

Instructions on obtaining these files are available in the Parabricks Giraffe documentation focused on Using Giraffe in Variant Calling workflows. The following steps also guide you through the process.

Step 1

Run baseline VG to generate a FASTA file from the graph.

Note that step 1 with baseline VG is a one-time run. Once you have the FASTA file from the graph, you don’t need to run step 1 again. Instead, run only steps 2 and 3 to process additional FASTQ samples.

# Extract the sequences corresponding to the list of paths to a FASTA file
docker run --rm --volume $(pwd):/workdir \
    --workdir /workdir \
    quay.io/vgteam/vg:v1.59.0 \
    vg paths -x hprc-v1.1-mc-grch38.gbz -p hprc-v1.1-mc-grch38.paths.sub -F > hprc-v1.1-mc-grch38.fa

# Index the FASTA file
samtools faidx hprc-v1.1-mc-grch38.fa
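Because step 1 only needs to run once, a wrapper script can check whether its outputs already exist before repeating it. The following is a minimal sketch, not part of Parabricks or VG. The function name `fasta_ready` is hypothetical, and it assumes the index sits next to the FASTA with the `.fai` suffix that `samtools faidx` writes.

```shell
# Hypothetical helper: succeeds when the graph FASTA and its samtools
# index (.fai) both exist and are non-empty, so step 1 can be skipped.
fasta_ready() {
    fasta=$1
    [ -s "$fasta" ] && [ -s "$fasta.fai" ]
}

if fasta_ready hprc-v1.1-mc-grch38.fa; then
    echo "FASTA and index found; skipping step 1"
else
    echo "FASTA or index missing; run step 1 first"
fi
```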

Step 2

Next, run Giraffe normally.

# This command assumes all the inputs are in the current working directory and all the outputs go to the same place.
docker run --rm --gpus all --volume $(pwd):/workdir --volume $(pwd):/outputdir \
    --workdir /workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun giraffe --read-group "sample_rg1" \
    --sample "sample-name" --read-group-library "library" \
    --read-group-platform "platform" --read-group-pu "pu" \
    --dist-name /workdir/hprc-v1.1-mc-grch38.dist \
    --minimizer-name /workdir/hprc-v1.1-mc-grch38.min \
    --gbz-name /workdir/hprc-v1.1-mc-grch38.gbz \
    --ref-paths /workdir/hprc-v1.1-mc-grch38.paths.sub \
    --in-fq /workdir/${INPUT_FASTQ_1} /workdir/${INPUT_FASTQ_2} \
    --out-bam /outputdir/${OUTPUT_BAM}

Step 3

Finally, use these three files as inputs for DeepVariant: run pangenome_aware_deepvariant with the BAM from step 2, the FASTA from step 1, and the graph GBZ file.

# Pangenome_aware_deepvariant
# This command assumes all the inputs are in the current working directory and all the outputs go to the same place.
docker run --rm --gpus all --volume $(pwd):/workdir --volume $(pwd):/outputdir \
    --workdir /workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun pangenome_aware_deepvariant \
    --ref /workdir/hprc-v1.1-mc-grch38.fa \
    --pangenome /workdir/hprc-v1.1-mc-grch38.gbz \
    --in-bam /workdir/${INPUT_BAM} \
    --out-variants /outputdir/${OUTPUT_VCF}
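Since only steps 2 and 3 need to be repeated for each new sample, they lend themselves to a loop. The sketch below prints the two commands for each sample instead of running them, so the result can be reviewed first. It reuses only the flags shown in steps 2 and 3. The FASTQ naming scheme (`<sample>_1.fastq.gz`, `<sample>_2.fastq.gz`), the per-sample read group and output names, and the function name `print_sample_commands` are assumptions made for illustration.

```shell
# Sketch: print (not run) the step 2 and step 3 commands for each sample.
# Assumptions: all files are in the current directory and FASTQ pairs are
# named <sample>_1.fastq.gz and <sample>_2.fastq.gz (hypothetical scheme).
IMAGE="nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1"
GRAPH="hprc-v1.1-mc-grch38"

print_sample_commands() {
    sample=$1
    # Unquoted heredoc: \$ keeps $(pwd) literal and \\ emits a single backslash.
    cat <<EOF
docker run --rm --gpus all --volume \$(pwd):/workdir --volume \$(pwd):/outputdir \\
    --workdir /workdir $IMAGE \\
    pbrun giraffe --read-group "${sample}_rg1" \\
    --sample "$sample" --read-group-library "library" \\
    --read-group-platform "platform" --read-group-pu "pu" \\
    --dist-name /workdir/$GRAPH.dist \\
    --minimizer-name /workdir/$GRAPH.min \\
    --gbz-name /workdir/$GRAPH.gbz \\
    --ref-paths /workdir/$GRAPH.paths.sub \\
    --in-fq /workdir/${sample}_1.fastq.gz /workdir/${sample}_2.fastq.gz \\
    --out-bam /outputdir/$sample.bam
docker run --rm --gpus all --volume \$(pwd):/workdir --volume \$(pwd):/outputdir \\
    --workdir /workdir $IMAGE \\
    pbrun pangenome_aware_deepvariant \\
    --ref /workdir/$GRAPH.fa \\
    --pangenome /workdir/$GRAPH.gbz \\
    --in-bam /workdir/$sample.bam \\
    --out-variants /outputdir/$sample.vcf
EOF
}

# Placeholder sample names; replace with your own.
for s in sampleA sampleB; do
    print_sample_commands "$s"
done
```

Printing the commands keeps the sketch safe to run on a machine without Docker or GPUs. Once the output looks right, it can be piped to a shell or saved as a script.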

STAR improvements: including quantMode GeneCounts

In addition to pangenome-aware mode for DeepVariant, the latest release of Parabricks also includes improvements to STAR, a widely used RNA-sequencing aligner. STAR is particularly useful for its speed and accuracy on RNA-seq data across sequencing platforms and for its scalability to large datasets. Already available in Parabricks, STAR now benefits from further GPU acceleration, running nearly 8x faster on two NVIDIA RTX PRO 6000 GPUs than CPU-only solutions.

In the latest release of Parabricks, quantMode GeneCounts is a new option available for STAR, which is valuable for a variety of applications relevant to gene expression, QC, normalization, and data integration. During the mapping step of alignment, quantMode GeneCounts enables fast generation of gene-level read counts.

*Figure 2. Compared to CPU-only solutions that took over 105 minutes, STAR runtimes were reduced to under 14 minutes on two NVIDIA RTX PRO 6000 GPUs*

Get started with STAR

quantMode GeneCounts can be added as an argument to STAR. An example command is below.

docker run --rm --gpus all --volume $(pwd):/workdir --volume $(pwd):/outputdir \
    --workdir /workdir \
    nvcr.io/nvidia/clara/clara-parabricks:4.6.0-1 \
    pbrun rna_fq2bam \
    --genome-lib-dir ${GENOME_DIR} \
    --in-fq ${FASTQ1} ${FASTQ2} \
    --output-dir ${OUT_DIR} \
    --ref ${GENOME} \
    --out-bam ${OUT_BAM} \
    --num-gpus ${GPU_NUM} \
    --quantMode GeneCounts
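Once the run finishes, the gene-level counts can be inspected with standard text tools. The sketch below assumes the Parabricks output follows upstream STAR's `ReadsPerGene.out.tab` layout: tab-separated, with the gene ID in column 1, the unstranded read count in column 2, and summary rows whose IDs start with `N_` at the top. Check the file name and location for your run. The function name `top_genes` is hypothetical.

```shell
# Hypothetical helper: list the genes with the highest unstranded counts.
# Usage: top_genes <counts_file> [number_of_genes, default 10]
top_genes() {
    counts_file=$1
    n=${2:-10}
    # Drop the N_* summary rows, keep gene ID and unstranded count,
    # then sort by count in descending order and keep the first n rows.
    awk -F'\t' '$1 !~ /^N_/ { print $1 "\t" $2 }' "$counts_file" |
        sort -t "$(printf '\t')" -k2,2nr |
        head -n "$n"
}

# Example usage with a placeholder path:
# top_genes "${OUT_DIR}/ReadsPerGene.out.tab" 20
```

For stranded libraries, upstream STAR reports the strand-specific counts in columns 3 and 4, so the column index in the `awk` command would need to change accordingly.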

Download Parabricks today

Download NVIDIA Parabricks v4.6 to get started with GPU-accelerated genomic analysis and join the conversation on the NVIDIA Parabricks Developer Forum.
