Machined Learnings, September 12
Solving convex optimization problems with partially eliminated variables

Solving convex optimization problems is easy these days. A problem of the form inf_z f(z) subject to h(z) = 0, g(z) \preceq 0, with f and g convex and h affine, can be handled by free packages such as cvxpy. With many variables and few constraints, solving a dual problem can simplify the computation; but if the primal variables cannot be fully eliminated, the problem becomes a constrained sup_x inf_y L(x, y). Standard tools such as cvxpy cannot accept this form, even though interior-point methods remain viable for it. The author's remedy is sequential quadratic minimax programming (SQM): approximate the objective with a quadratic around the current point, linearize the constraints, convert the subproblem into a standard QP via the Wolfe dual, and recover a Newton step for the original problem. The method is especially convenient when the constraints are linear, since an initial exact inner solve then keeps the iterates feasible.


If you need to solve a convex optimization problem nowadays, you are in great shape. Any problem of the form $$
\begin{alignat}{2}
&\!\inf_z & \qquad & f(z) \\
& \text{subject to} & & h(z) = 0 \\
& & & g(z) \preceq 0
\end{alignat}
$$ where $f$ and $g$ are convex and $h$ is affine can be attacked by several excellent freely available software packages: my current favorite is cvxpy, which is a joy to use. If you have a lot of variables and not a lot of constraints, you can instead solve a dual problem. It ends up looking like $$
\begin{alignat}{2}
&\!\sup_x & \qquad & L(x) \\
& \text{subject to} & & \tilde{h}_x(x) = 0 \\
& & & \tilde{g}_x(x) \preceq 0
\end{alignat}
$$ where $L$ is concave, assuming that you get lucky and can analytically eliminate all the primal variables $z$ such that only the dual variables $x$ remain. But what if you can't eliminate all the primal variables, but only most of them? You might end up with a problem like $$
\begin{alignat}{2}
&\!\sup_x \inf_y & \qquad & L(x, y) \\
& \text{subject to} & & \tilde{h}_x(x) = 0 \\
& & & \tilde{g}_x(x) \preceq 0 \\
& & & \tilde{h}_y(y) = 0 \\
& & & \tilde{g}_y(y) \preceq 0
\end{alignat}
$$ where $\tilde{g}_x$ and $\tilde{g}_y$ are convex, $\tilde{h}_x$ and $\tilde{h}_y$ are affine, $L(x, \cdot)$ is convex in $y$ given fixed $x$, and $L(\cdot, y)$ is concave in $x$ given fixed $y$. It feels like this problem should be easier to solve than the original problem if many primal variables have been analytically eliminated. Unfortunately, none of my favorite convex optimization toolkits will accept a problem of this form. This is despite the viability of interior-point methods for such problems. Bummer.
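
For concreteness, here is a minimal cvxpy sketch of the standard form above on made-up toy data (a convex quadratic objective with affine constraints; all names below are illustrative, not from the post):

```python
import cvxpy as cp
import numpy as np

# Toy instance of: inf_z f(z) s.t. h(z) = 0, g(z) <= 0,
# with f convex quadratic and h, g affine.
n = 5
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
Q = M.T @ M + np.eye(n)        # positive definite, so f is convex
Q = 0.5 * (Q + Q.T)            # symmetrize against rounding noise
c = rng.standard_normal(n)
A, b = rng.standard_normal((2, n)), rng.standard_normal(2)  # h(z) = A z - b
G, h = rng.standard_normal((3, n)), rng.standard_normal(3)  # g(z) = G z - h

z = cp.Variable(n)
problem = cp.Problem(
    cp.Minimize(0.5 * cp.quad_form(z, Q) + c @ z),
    [A @ z == b, G @ z <= h],
)
problem.solve()
print(problem.status, problem.value)
```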

One thing I tried was to solve the inner infimum using a standard toolkit, compute the gradient of the solution with respect to the outer parameters via the sensitivity map, and then use a first-order method for the outer supremum. This did not work for me: it succeeds on toy problems, but on real problems the outer supremum converges very slowly, suggesting ill-conditioning.
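
For the record, here is roughly what that looks like in the simplest setting, where the $y$ feasible set does not depend on $x$ and Danskin's theorem turns the inner solution directly into an outer supergradient (the sensitivity-map approach in the post is more general). A hedged sketch on invented data, not the actual problem:

```python
import cvxpy as cp
import numpy as np

# Hypothetical saddle problem: sup_x inf_{y : G y <= h} L(x, y) with
#   L(x, y) = -0.5 ||x||^2 + x @ C @ y + 0.5 y @ D @ y + q @ y,
# concave in x and convex in y. All data is made up for illustration.
rng = np.random.default_rng(1)
nx, ny = 3, 4
C = rng.standard_normal((nx, ny))
D = 2.0 * np.eye(ny)                      # PD, so the inner problem is convex
q = rng.standard_normal(ny)
G, h = rng.standard_normal((2, ny)), np.ones(2)

y = cp.Variable(ny)
x_par = cp.Parameter(nx)
inner = cp.Problem(
    cp.Minimize(x_par @ C @ y + 0.5 * cp.quad_form(y, D) + q @ y),
    [G @ y <= h],
)

x = np.zeros(nx)
for _ in range(500):                      # first-order outer loop
    x_par.value = x
    inner.solve()
    # Danskin: since the y-feasible set does not depend on x,
    # grad_x L(x, y*(x)) = -x + C y*(x) is a supergradient of the outer value.
    x += 0.1 * (-x + C @ y.value)
```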

What I need is the power of interior-point methods to handle ill-conditioning via second-order information. I'm able to achieve this via sequential quadratic minimax programming: first, locally approximate the objective $L(x, y)$ with a quadratic around the current point and linearize the constraints. $$
\begin{alignat}{2}
&\!\sup_{\delta x} \inf_{\delta y} & \qquad & \frac{1}{2} \left(\begin{array}{c} \delta x \\ \delta y \end{array}\right)^\top \left(\begin{array}{cc} P_{xx} & P_{yx}^\top \\ P_{yx} & P_{yy} \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \end{array}\right) + \left(\begin{array}{c} q_x \\ q_y \end{array}\right)^\top \left(\begin{array}{c} \delta x \\ \delta y \end{array}\right) \\
& \text{subject to} & & \left(\begin{array}{cc} A_x & 0 \\ 0 & A_y \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \end{array}\right) = \left(\begin{array}{c} b_x \\ b_y \end{array}\right) \\
& & & \left(\begin{array}{cc} G_x & 0 \\ 0 & G_y \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \end{array}\right) \preceq \left(\begin{array}{c} h_x \\ h_y \end{array}\right)
\end{alignat}
$$ The Wolfe dual converts this problem into a standard QP: $$
\begin{alignat}{2}
&\!\sup_{\delta x, \delta y, \lambda, \mu} & \qquad & \frac{1}{2} \left(\begin{array}{c} \delta x \\ \delta y \\ \lambda \\ \mu \end{array}\right)^\top \left(\begin{array}{cccc} P_{xx} & 0 & 0 & 0 \\ 0 & -P_{yy} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \\ \lambda \\ \mu \end{array}\right) + \left(\begin{array}{c} q_x \\ 0 \\ -b_y \\ -h_y \end{array}\right)^\top \left(\begin{array}{c} \delta x \\ \delta y \\ \lambda \\ \mu \end{array}\right) \\
& \text{subject to} & & \left(\begin{array}{cccc} A_x & 0 & 0 & 0 \\ P_{yx} & P_{yy} & A_y^\top & G_y^\top \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \\ \lambda \\ \mu \end{array}\right) = \left(\begin{array}{c} b_x \\ -q_y \end{array}\right) \\
& & & \left(\begin{array}{cccc} G_x & 0 & 0 & 0 \\ 0 & 0 & 0 & -I \end{array}\right) \left(\begin{array}{c} \delta x \\ \delta y \\ \lambda \\ \mu \end{array}\right) \preceq \left(\begin{array}{c} h_x \\ 0 \end{array}\right)
\end{alignat}
$$ If you solve this for $(\delta x, \delta y)$, you get a Newton step for your original problem. The step acceptance criterion here is tricky: if the iterate is feasible, you want to leverage the saddle-point condition (see equation (11) of Essid et al.). If the iterate is infeasible, more sophistication is required, but fortunately my constraints were actually linear, so doing an initial exact inner solve allowed me to iterate while staying feasible. (Note: if you solve a more general convex problem on each step, you don't need to linearize the $x$ constraints.)
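
To make the recipe concrete, here is a hedged sketch of a single SQM step as a cvxpy QP in the Wolfe-dual form above. The matrices are assumed to come from your own quadratic model of $L$ and linearization of the constraints at the current iterate; the function name and signature are mine:

```python
import cvxpy as cp

def sqm_newton_step(Pxx, Pyy, Pyx, qx, qy, Ax, bx, Ay, by, Gx, hx, Gy, hy):
    """One SQM step: the Wolfe dual of the local quadratic minimax model,
    solved as a single standard QP. Assumes Pxx is negative semidefinite
    (L concave in x) and Pyy is positive semidefinite (L convex in y)."""
    dx, dy = cp.Variable(len(qx)), cp.Variable(len(qy))
    lam = cp.Variable(len(by))   # multipliers for A_y dy = b_y
    mu = cp.Variable(len(hy))    # multipliers for G_y dy <= h_y
    objective = cp.Maximize(
        0.5 * cp.quad_form(dx, Pxx) - 0.5 * cp.quad_form(dy, Pyy)
        + qx @ dx - by @ lam - hy @ mu
    )
    constraints = [
        Ax @ dx == bx,
        # stationarity of the inner Lagrangian with respect to dy:
        Pyx @ dx + Pyy @ dy + Ay.T @ lam + Gy.T @ mu == -qy,
        Gx @ dx <= hx,
        mu >= 0,
    ]
    cp.Problem(objective, constraints).solve()
    return dx.value, dy.value
```

In an actual solver you would wrap this step in the acceptance logic discussed above rather than taking the raw step blindly.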

YMMV!
