*This note is still open
The Primal Question of Optimization
For a general optimization problem, it usually could be rewritten as maximizing or minimizing a certain function with several constrictions. For example, maybe you want the optimized value non-negative. The most basic form of such is called the primal question which looks like this:
\begin{matrix} \underset{x}{min}f(x), x \in \mathbb{R}^n \\ s.t. \\ g_i(x)\leq 0, i=1,2,…,m \\ h_i(x)= 0, i=1,2,…,p \end{matrix}And we call f(x) the objective function and the rest two the constriction function.
The Lagrangian Function
To solve such primal question with constrictions, we can use the Lagrange multiplier to encode the three functions into one:
L(x,u,v)=f(x)+\sum_{i=1}^{m}u_ig_i(x)+\sum_{i=1}^{m}v_ih_i(x)As such, we call u_i,v_i the lagrange multipliers.
And we make u_i \geq 0, v_i \in \mathbb{R} so that:
Because g_i(x) \leq 0, so that the maximum of \sum_{i=1}^{m}u_ig_i(x) is 0. And since h_i(x)=0, so \sum_{i=1}^{m}v_ih_i(x) is also equal to 0. Thus:
\underset{u,v}{max}L(x,u,v)=f(x)+0+0=f(x)In this way, we find:
\underset{x}{min}f(x)=\underset{x}{min}\underset{u,v}{max}L(x,u,v)The Lagrangian Dual Function
But we find the expression above is hard to solve, so we need to transfer it to a dual function as such:
Define D the feasible domain:
g(u,v)=\underset{x\in D}{inf}L(x,u,v)=\underset{x\in D}{inf}[f(x)+\sum_{i=1}^{m}u_ig_i(x)+\sum_{i=1}^{m}v_ih_i(x)]
the lagrangian dual function(g(x))
while the function does not have a lower bound, we define the value as -\infty, so that the dual is a concave function, and since we are trying to find the maximum, it could be treated as a convex optimization problem.
Because:
\underset{x}{min}f(x)=\underset{x}{min}\underset{u,v}{max}L(x,u,v)\geq \underset{u,v}{max}\underset{x}{min}L(x,u,v)= \underset{u,v}{max}g(x)Proof of the Minimax Theorem
euclid.kmj_.1138038812Dual gap
we use p^* to represent the optimized solution of the primal problem:
p^*=\underset{x}{min}f(x)And d^* to represent the optimized solution of the lagrangian dual:
d^*=\underset{u,v}{max}g(u,v)And because the minimax theorem:
p^* \geq d^*We are using d^* to approach p^* and calling p^*-d^* the dual gap
When the dual gap is zero, we call the situation as strong dual, otherwise the weak dual.
I am not going to write about KKT&Slater for now…*
$latex \int_{-\infty}^\infty e^{-x^2}\,dx=\sqrt\pi$