NVIDIA cuOpt推出GPU加速线性规划新求解器

How does the NFL schedule all its regular-season games while avoiding stadium conflicts with Beyoncé concerts?

How can doctors use a single donated kidney to trigger a chain of transplants, and how do they create the longest possible transplant chain to save the most patients?

How do airlines plan flight crew schedules around rest requirements and crew locations while minimizing hotel and deadheading costs?

These types of problems can be solved using optimization, a branch of applied mathematics that uses linear programming (LP) and mixed integer programming (MIP) to find optimal solutions. However, solving such problems isn’t easy because they often contain millions of variables and constraints that can slow CPUs. Approximate solutions can sometimes be found quickly, but finding highly accurate solutions takes longer.

NVIDIA cuOpt is an open source GPU-accelerated library for solving optimization problems. The latest cuOpt release includes a new solver for linear programs that uses the barrier method. This new solver complements the existing cuOpt LP solvers, enabling engineers, researchers, and developers to solve large-scale linear programs quickly and accurately.

Internal benchmarks show cuOpt barrier delivering over 8x average speedup compared to a leading open source CPU solver and over 2x average speedup compared to a popular commercial CPU solver on a public test set of large linear programs (Figure 1).

When cuOpt barrier is combined with cuOpt PDLP in concurrent mode, it ranked first among open source solvers and second overall out of eleven solvers in public benchmarks (retrieved October 20, 2025).

This post explains the basics of linear programs, introduces the new cuOpt GPU-accelerated barrier method, and dives deeper into the benchmark data on its performance.

*Figure 1. Average speedup on Mittelmann test set on NVIDIA GH200 Grace Hopper*

What are linear programming and mixed integer programming?

Linear programming is a technique for solving optimization problems where a linear objective function is minimized subject to linear constraints. A wide variety of applications can be cast as linear programs, such as blending and mixing problems in food and chemical production, transportation, network flow problems, and portfolio optimization in finance.

Linear programming also plays a crucial role in mixed integer programming, a technique for solving optimization problems where some variables take on integer, or whole number, values. Mixed integer programming is a crucial tool in the field of operations research that allows practitioners to solve problems with discrete decisions. There are many applications of mixed integer programming, including:

How are linear programs solved?

There are three main methods for solving linear programs: simplex, PDLP, and barrier.

Simplex:

simplex method

invented by George Dantzig

PDLP:

Primal-Dual Hybrid Gradient for Linear Programming (PDLP)

Barrier:

predictor-corrector method

Each method excels in different problems and in different use cases (Table 1).

Method	When to use
Simplex	–Best for small to medium LPs –Requires a factorization to fit into memory –Produces the highest-accuracy solutions –Solutions lie at a vertex of the feasible region
PDLP	–Fast on large LPs –No factorization required –Low-accuracy solutions must be acceptable
Barrier	–Fast on large LPs –Requires a factorization to fit into memory –Produces high-accuracy solutions

Table 1. The three methods for solving linear programming problems and their ideal applications, and specific requirements

How does cuOpt solve linear programs?

When using cuOpt, you don’t need to choose which method to solve linear programs. By default, cuOpt runs all three algorithms concurrently and provides the solution from the method that finishes first.

cuOpt contains an implementation of PDLP that can very quickly solve LPs to low accuracy (a relative tolerance of 1e-4 to 1e-6, for example). However, many applications of linear programming require more accurate solutions (a relative tolerance of 1e-8 or an absolute tolerance of 1e-6, for example).

Previously, when a cuOpt user required a highly accurate solution, the only option was to use the simplex algorithm. While the simplex algorithm is well-suited for small- to medium-scale LPs, it can struggle on large-scale LPs.

With the introduction of the new barrier method, cuOpt extends its high-performance GPU-accelerated LP solvers to large-scale LPs where high accuracy is required.

How does the cuOpt GPU-accelerated barrier solver work?

The cuOpt barrier method uses the same well-proven predictor-corrector algorithm developed by researchers in the 1990s, following the book Primal-Dual Interior-Point Methods and the paper On Implementing Mehrotra’s Predictor-Corrector Interior-Point Method for Linear Programming.

The predictor-corrector method requires solving multiple sparse systems of linear equations at each iteration. Until recently, these were solved using multi-threaded algorithms on the CPU. This changed with the release of NVIDIA cuDSS, the NVIDIA GPU-accelerated sparse direct-solver library. cuDSS significantly outperforms CPU algorithms. cuDSS enabled the cuOpt team to bring the Mehrotra predictor-corrector algorithm to the GPU.

The latest release, cuDSS 0.7, includes several enhancements developed specifically for cuOpt:

Faster symbolic factorization

Option to stop cuDSS reordering and symbolic factorization phases

Deterministic mode

In addition to cuDSS, cuOpt uses cuSPARSE to perform sparse matrix-matrix multiplication and sparse matrix-vector multiplications efficiently on the GPU.

How does the cuOpt barrier method compare to open source and commercial CPU solvers?

Performance of the new cuOpt barrier method was measured over a publicly available test set of 61 linear programs maintained by Hans Mittelmann at Arizona State University. For details, see Benchmarks for Optimization Software.

The linear programs in this test set are large. Figure 2 shows the distribution of these problems in terms of the number of variables and constraints—each dot represents an LP. This test set has about a dozen problems with more than 1 million variables and constraints.

*Figure 2. Dimensions of LPs in the Mittelmann test set*

Experimental setup

Runtime was compared for the cuOpt barrier method versus several CPU solvers. All solvers were run on an NVIDIA GH200 Grace Hopper machine with 72 3.4GHz CPU cores, 500 GB of RAM, and an NVIDIA H200 GPU with 480 GB of VRAM. Each solver was configured to run the barrier method without crossover; all other settings were taken as default. All problems were run with a time limit of one hour. If a solver failed to solve a problem, the runtime was taken to be one hour.

How does the cuOpt barrier method compare to an open source CPU solver?

In the test set, cuOpt solves 55/61 problems, and the open source solver solves 48/61 problems. On average (geometric mean) over the test set, cuOpt is faster than the open source solver by 8.81x.

Figure 3 shows the speedup of the cuOpt barrier method compared to the open source solver for each of the 61 problems in the Mittelmann test set on NVIDIA GH200. Each bar represents the open source solver runtime divided by the cuOpt runtime on a particular model. Bars greater than 1 (in green) represent LPs where cuOpt is faster. Bars less than 1 (in gray) represent LPs where the open source solver is faster.

*Figure 3. cuOpt barrier speedup compared to an open source CPU solver on GH200*

How does the cuOpt barrier method compare to commercial CPU solvers?

cuOpt was also measured against two leading commercial solvers. The experimental setup was the same as previously described: all solvers were run using the barrier method without crossover and default settings.

Commercial CPU solver A

On 31/61 problems in the test set, cuOpt is faster than the commercial solver. On seven problems, cuOpt is more than 5x faster than the commercial solver—with a maximum speedup of 17x. However, on average (geometric mean), commercial solver A is 1.5x faster over the whole test set. This is likely due to the sophisticated presolve methods used by commercial solver A. cuOpt does not have an extensive presolve. Commercial solver A solves 60/61 problems in the test set.

Commercial CPU solver B

cuOpt does better against another popular commercial CPU solver B. On average (geometric mean) over the test set, cuOpt is 2x faster than commercial solver B.

This commercial solver solved 58/61 problems in the test set.

*Figure 5. cuOpt barrier speedup over commercial CPU solver B on GH200*

Get started with the NVIDIA cuOpt barrier method

The cuOpt open source solver helps solve large-scale LPs quickly and accurately. You can now run simplex, PDLP, and barrier on your largest and most challenging linear programs.

To get started, visit the cuOpt Barrier Notebook to learn how to install cuOpt 25.10, run the new barrier method, and use important settings like presolve and folding.

Download cuOpt 25.10 and run it on your toughest linear programs to see how it compares with other solvers.