10.12. Quadratic Programming#

10.12.1. Quadratic Functions#

Definition 10.41 (Quadratic function)

A function \(f : \RR^n \to \RR\) of the form

\[ f(\bx) = \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \]

where \(\bA \in \SS^n\), \(\bb \in \RR^n\) and \(c \in \RR\), is known as a quadratic function.

The matrix \(\bA\) is known as the matrix associated with the quadratic function.

Remark 10.8 (Gradient and Hessian of a quadratic function)

Let

\[ f(\bx) = \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \]

be a quadratic function. Then, the gradient is given by:

\[ \nabla f(\bx) = \bA \bx + \bb. \]

And, the Hessian is given by:

\[ \nabla^2 f(\bx) = \bA. \]

See Example 5.8 and Example 5.13 for reference.
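The closed-form gradient above can be checked numerically. The following is a minimal sketch (assuming NumPy; the matrix \(\bA\), vector \(\bb\), and scalar \(c\) below are an arbitrary hypothetical instance) that compares \(\nabla f(\bx) = \bA \bx + \bb\) against a central finite-difference approximation.

```python
import numpy as np

# Hypothetical instance: a random symmetric A, random b, arbitrary c.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2          # symmetrize so that A is in S^n
b = rng.standard_normal(3)
c = 1.5

f = lambda x: 0.5 * x @ A @ x + b @ x + c
grad = lambda x: A @ x + b  # closed-form gradient from the remark

# Central finite differences along each coordinate direction.
x0 = rng.standard_normal(3)
eps = 1e-6
fd = np.array([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
               for e in np.eye(3)])
assert np.allclose(fd, grad(x0), atol=1e-6)
```

Since \(f\) is quadratic, the central difference is exact up to floating-point rounding, so the two gradients agree to high precision.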

10.12.1.1. Stationary Points#

Theorem 10.86 (Stationary points of quadratic functions)

Let a quadratic function \(f : \RR^n \to \RR\) be given by

\[ f(\bx) = \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \]

where \(\bA \in \SS^n\), \(\bb \in \RR^n\) and \(c \in \RR\).

  1. \(\bx \in \RR^n\) is a stationary point if and only if \(\bA \bx = - \bb\).

  2. If \(\bA \succeq \ZERO\), then \(\bx\) is a global minimum point of \(f\) if and only if \(\bA \bx = -\bb\).

  3. If \(\bA \succ \ZERO\), then \(\bx = - \bA^{-1} \bb\) is a strict global minimum point of \(f\).

  4. If \(\bA \succ \ZERO\), then the minimum value of \(f\) is \(c - \frac{1}{2} \bb^T \bA^{-1} \bb\).

Proof. (1) is a direct implication of the fact that \(\nabla f(\bx) = \bzero\) if and only if \(\bA \bx + \bb = \bzero\).

(2) We are given that \(\nabla^2 f(\bx) = \bA \succeq \ZERO\).

  1. Thus, \(\nabla^2 f(\bx) \succeq \ZERO\) for every \(\bx \in \RR^n\).

  2. By Theorem 8.4, if \(\bx\) is a stationary point of \(f\), then it is a global minimum point.

  3. By the first part, \(\bx\) is a stationary point if and only if \(\bA \bx = - \bb\).

(3) We are given that \(\bA \succ \ZERO\).

  1. Then, \(\bA\) is invertible.

  2. Hence, \(\bx = - \bA^{-1} \bb\) is the unique solution to the equation \(\bA \bx = - \bb\).

  3. By parts (1) and (2), it is the unique (hence strict) global minimizer of \(f\).

(4) By part (3), the strict global minimum point of \(f\) is \(\ba = - \bA^{-1} \bb\), which satisfies \(\bA \ba = - \bb\). Therefore,

\[\begin{split} f(\ba) &= \frac{1}{2} \ba^T \bA \ba + \bb^T \ba + c \\ &= - \frac{1}{2} \ba^T \bb + \bb^T \ba + c \\ &= c + \frac{1}{2} \bb^T \ba \\ &= c - \frac{1}{2} \bb^T \bA^{-1} \bb. \end{split}\]
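Parts (3) and (4) of the theorem can be verified on a small instance. The sketch below (assuming NumPy; the diagonal \(\bA\) and the vectors are hypothetical choices made for transparency) computes \(\bx^* = -\bA^{-1}\bb\) and the minimum value \(c - \frac{1}{2}\bb^T \bA^{-1}\bb\), and checks that perturbations only increase \(f\).

```python
import numpy as np

# Hypothetical positive definite instance (diagonal for easy hand-checking).
A = np.diag([2.0, 3.0])
b = np.array([-4.0, 6.0])
c = 1.0

f = lambda x: 0.5 * x @ A @ x + b @ x + c

x_star = -np.linalg.solve(A, b)              # x* = -A^{-1} b = (2, -2)
f_min = c - 0.5 * b @ np.linalg.solve(A, b)  # c - (1/2) b^T A^{-1} b = -9

assert np.allclose(f(x_star), f_min)
# Since A is positive definite, x* is a strict global minimum:
rng = np.random.default_rng(1)
for _ in range(100):
    assert f(x_star + 0.1 * rng.standard_normal(2)) > f_min
```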

10.12.1.2. Coerciveness#

Theorem 10.87 (Coerciveness of quadratic functions)

Let a quadratic function \(f : \RR^n \to \RR\) be given by

\[ f(\bx) = \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \]

where \(\bA \in \SS^n\), \(\bb \in \RR^n\) and \(c \in \RR\).

\(f\) is coercive if and only if \(\bA \succ \ZERO\); i.e., \(\bA\) is positive definite.

Proof. Assume that \(\bA\) is positive definite.

  1. Then, all eigenvalues of \(\bA\) are positive.

  2. Let \(\lambda\) be the smallest eigenvalue of \(\bA\).

  3. Then, \(\bx^T \bA \bx \geq \lambda \| \bx \|^2\) for every \(\bx \in \RR^n\).

  4. Thus,

    \[\begin{split} f(\bx) &\geq \frac{ \lambda}{2} \| \bx \|^2 + \bb^T \bx + c \\ &\geq \frac{ \lambda}{2} \| \bx \|^2 - \|\bb \| \| \bx \| + c & \text{ Cauchy–Schwarz inequality }\\ &= \frac{ \lambda}{2} \| \bx \| \left (\| \bx \| - \frac{2}{\lambda} \| \bb \| \right ) + c. \end{split}\]
  5. We can see that \(f(\bx) \to \infty\) as \(\| \bx \| \to \infty\).

  6. Thus, \(f\) is coercive.

Now, assume that \(f\) is coercive.

  1. We need to show that \(\bA\) must be positive definite.

  2. Equivalently, we must show that all eigenvalues of \(\bA\) are positive.

  3. For contradiction, assume that an eigenvalue of \(\bA\) is negative.

  4. Let \(\lambda < 0\) be such an eigenvalue with the corresponding normalized eigenvector \(\bv\) such that \(\bA \bv = \lambda \bv\).

  5. Then, for any \(t \in \RR\),

    \[\begin{split} f(t \bv) &= \frac{t^2}{2} \bv^T \bA \bv + t \bb^T \bv + c \\ &= \frac{\lambda t^2}{2} + t \bb^T \bv + c. \end{split}\]
  6. Clearly, \(f(t \bv) \to -\infty\) as \(t \to \infty\) since \(\lambda\) is negative.

  7. Thus, it contradicts the hypothesis that \(f\) is coercive.

  8. We now consider the possibility that \(\bA\) has a zero eigenvalue.

  9. Then, there exists a normalized eigenvector \(\bv\) such that \(\bA \bv = \bzero\).

  10. Then, for any \(t \in \RR\),

    \[ f(t \bv) = t \bb^T \bv + c. \]
  11. If \(\bb^T \bv = 0\), then \(f(t\bv) = c\) for every \(t \in \RR\).

  12. If \(\bb^T \bv > 0\), then \(f(t \bv) \to -\infty\) as \(t \to -\infty\).

  13. If \(\bb^T \bv < 0\), then \(f(t \bv) \to -\infty\) as \(t \to \infty\).

  14. In all three cases, \(f(t \bv)\) does not go to \(\infty\) as \(\| t \bv \| \to \infty\).

  15. Thus, \(f\) is not coercive, contradicting the hypothesis.

  16. Hence, the eigenvalues of \(\bA\) must be positive.

  17. Hence, \(\bA\) must be positive definite.
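Both directions of the proof can be illustrated numerically. The sketch below (assuming NumPy; both matrices and the direction \(\bv\) are hypothetical choices) compares a positive definite \(\bA\) with a merely positive semidefinite one along the zero-eigenvector, where \(f(t\bv) = t\, \bb^T \bv + c\) stays affine in \(t\) and the function fails to be coercive.

```python
import numpy as np

b = np.array([1.0, -2.0])
c = 0.0
f = lambda A, x: 0.5 * x @ A @ x + b @ x + c

A_pd  = np.array([[2.0, 0.0], [0.0, 1.0]])   # positive definite: coercive
A_psd = np.array([[1.0, 0.0], [0.0, 0.0]])   # eigenvalue 0: not coercive

# v is the eigenvector of A_psd with eigenvalue 0; b^T v = -2.
v = np.array([0.0, 1.0])
ts = np.array([1e2, 1e4, 1e6])
vals_pd  = np.array([f(A_pd,  t * v) for t in ts])
vals_psd = np.array([f(A_psd, t * v) for t in ts])

assert vals_pd[-1] > vals_pd[0] > 0          # grows without bound along v
assert np.all(vals_psd == -2.0 * ts)         # affine in t: f(t v) = -2t, not coercive
```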

10.12.1.3. Nonnegative Quadratics#

It is useful to work with quadratic functions which are nonnegative on the entire \(\RR^n\).

The basic quadratic form \(f(\bx) = \frac{1}{2} \bx^T \bA \bx\) is nonnegative on the entire \(\RR^n\) if and only if \(\bA\) is positive semidefinite.

For the general quadratic function, we must also account for the contributions of the \(\bb\) and \(c\) terms.

Theorem 10.88 (Nonnegativity of quadratic function)

Let a quadratic function \(f : \RR^n \to \RR\) be given by

\[ f(\bx) = \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \]

where \(\bA \in \SS^n\), \(\bb \in \RR^n\) and \(c \in \RR\).

The following statements are equivalent.

  1. \(f(\bx) \geq 0\) for every \(\bx \in \RR^n\).

  2. \(\begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix} \succeq \ZERO\); i.e., this \((n+1) \times (n+1)\) symmetric matrix is positive semidefinite.

Proof. Assume that (2) is true.

  1. Then, for every \(\bx \in \RR^n\)

    \[\begin{split} \begin{bmatrix} \bx \\ 1 \end{bmatrix}^T \begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix} \begin{bmatrix} \bx \\ 1 \end{bmatrix} \geq 0 \end{split}\]

    due to positive semidefiniteness.

  2. But

    \[\begin{split} \begin{bmatrix} \bx \\ 1 \end{bmatrix}^T \begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix} \begin{bmatrix} \bx \\ 1 \end{bmatrix} &= \begin{bmatrix} \bx^T & 1 \end{bmatrix} \begin{bmatrix} \bA \bx + \bb \\ \bb^T \bx + 2c \end{bmatrix} \\ & = \bx^T \bA \bx + 2 \bx^T \bb + 2 c \\ &= 2 \left ( \frac{1}{2} \bx^T \bA \bx + \bb^T \bx + c \right ) \\ &= 2 f(\bx). \end{split}\]
  3. Thus, \(f(\bx) \geq 0\) for every \(\bx \in \RR^n\).

For the converse, assume (1) is true.

  1. We need to show that \(\begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix}\) is positive semidefinite.

  2. We shall first show that \(\bA\) is positive semidefinite.

  3. For contradiction, assume that \(\bA\) is not positive semidefinite.

  4. Then, there exists a negative eigenvalue \(\lambda < 0\) and corresponding normalized eigenvector \(\bv\) for \(\bA\) such that \(\bA \bv = \lambda \bv\).

  5. Then, for any \(t \in \RR\)

    \[ f(t \bv) = \frac{\lambda t^2}{2} + t \bb^T \bv + c. \]
  6. Then, \(f(t \bv) \to -\infty\) as \(t \to -\infty\).

  7. This contradicts the hypothesis that \(f\) is nonnegative everywhere.

  8. Thus, \(\bA\) must be positive semidefinite.

  9. We now need to show that for any \(\by \in \RR^n\) and any \(t \in \RR\),

    \[\begin{split} \begin{bmatrix} \by \\ t \end{bmatrix}^T \begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix} \begin{bmatrix} \by \\ t \end{bmatrix} \geq 0. \end{split}\]
  10. This condition is equivalent to

    \[ \frac{1}{2} \by^T \bA \by + t \bb^T \by + c t^2 \geq 0 \]

    for every \(\by \in \RR^n\) and \(t \in \RR\).

  11. If \(t = 0\), then this condition reduces to

    \[ \by^T \bA \by \geq 0 \Forall \by \in \RR^n. \]
  12. This is valid for every \(\by \in \RR^n\) since \(\bA\) is p.s.d., as established earlier.

  13. For \(t \neq 0\), we have

    \[ t^2 f \left ( \frac{\by}{ t} \right ) = t^2 \left ( \frac{1}{2 t^2} \by^T \bA \by + \frac{1}{t} \bb^T \by + c \right ) = \frac{1}{2 } \by^T \bA \by + t \bb^T \by + c t^2. \]
  14. By hypothesis, \(t^2 f \left ( \frac{\by}{ t} \right ) \geq 0\) for every \(\by \in \RR^n\) and \(t \neq 0\).

  15. Thus, \(\frac{1}{2 } \by^T \bA \by + t \bb^T \by + c t^2 \geq 0\) for every \(\by \in \RR^n\) and \(t \in \RR\).

  16. Thus, \(\begin{bmatrix} \bA & \bb \\ \bb^T & 2 c \end{bmatrix}\) is indeed p.s.d.
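The equivalence in Theorem 10.88 gives a concrete test for nonnegativity. The following sketch (assuming NumPy; the instance is a hypothetical choice with \(f(\bx) = \frac{1}{2}(x_1 - 1)^2 + \frac{1}{2}x_2^2 \geq 0\)) assembles the \((n+1) \times (n+1)\) block matrix and checks its eigenvalues.

```python
import numpy as np

# Hypothetical instance: f(x) = (1/2)||x||^2 - x1 + 1/2 = (1/2)(x1 - 1)^2 + (1/2)x2^2 >= 0.
A = np.eye(2)
b = np.array([-1.0, 0.0])
c = 0.5

# Assemble the (n+1) x (n+1) block matrix [[A, b], [b^T, 2c]].
M = np.block([[A, b[:, None]], [b[None, :], np.array([[2 * c]])]])

# For a symmetric matrix, p.s.d. is equivalent to all eigenvalues >= 0.
eigs = np.linalg.eigvalsh(M)
assert np.all(eigs >= -1e-12)

# Cross-check: f is nonnegative at random points.
f = lambda x: 0.5 * x @ A @ x + b @ x + c
rng = np.random.default_rng(2)
assert all(f(10 * rng.standard_normal(2)) >= -1e-12 for _ in range(1000))
```

Here the block matrix has eigenvalues \(\{0, 1, 2\}\), consistent with \(f\) being nonnegative but attaining the value \(0\) at \(\bx = (1, 0)\).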

10.12.2. Quadratic Optimization Problems#

We consider the following possibilities:

  1. The objective function is a quadratic function.

  2. Both the objective function as well as the inequality constraints function are quadratic.

10.12.2.1. Quadratic Program#

Definition 10.42 (Quadratic program)

A convex optimization problem is known as a quadratic program (QP) if the objective function is a (convex) quadratic and the constraint functions are affine.

A general quadratic program has the following form:

(10.30)#\[\begin{split}& \text{minimize } & & \frac{1}{2} \bx^T \bP \bx + \bq^T \bx + r \\ & \text{subject to } & & \bG \bx \preceq \bh \\ & & & \bA \bx = \bb\end{split}\]

where

  1. \(\bx \in \RR^n\) is the optimization variable.

  2. \(\bP \in \SS^n_+\) is a symmetric positive semidefinite matrix.

  3. \(\bq \in \RR^n\) and \(r \in \RR\).

  4. \(f(\bx) = \frac{1}{2} \bx^T \bP \bx + \bq^T \bx + r\) is a (convex) quadratic objective function.

  5. \(\bG \in \RR^{m \times n}\) and \(\bh \in \RR^m\) describe the \(m\) affine inequality constraints.

  6. \(\bA \in \RR^{p \times n}\) and \(\bb \in \RR^p\) describe the \(p\) affine equality constraints.
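When the inequality constraints are absent, the QP (10.30) reduces to an equality-constrained QP, whose KKT conditions form the linear system \(\begin{bmatrix} \bP & \bA^T \\ \bA & \ZERO \end{bmatrix} \begin{bmatrix} \bx \\ \bnu \end{bmatrix} = \begin{bmatrix} -\bq \\ \bb \end{bmatrix}\). The following is a minimal numerical sketch (assuming NumPy; the instance — projecting the origin onto the line \(x_1 + x_2 = 1\) — is a hypothetical example), not a general-purpose QP solver.

```python
import numpy as np

# Hypothetical equality-constrained QP:
#   minimize (1/2) x^T P x + q^T x  subject to A x = b.
P = np.array([[2.0, 0.0], [0.0, 2.0]])   # positive definite
q = np.array([0.0, 0.0])
A = np.array([[1.0, 1.0]])               # single constraint: x1 + x2 = 1
b = np.array([1.0])

# KKT system: [[P, A^T], [A, 0]] [x; nu] = [-q; b].
n, p = P.shape[0], A.shape[0]
K = np.block([[P, A.T], [A, np.zeros((p, p))]])
rhs = np.concatenate([-q, b])
sol = np.linalg.solve(K, rhs)
x_star, nu = sol[:n], sol[n:]

assert np.allclose(A @ x_star, b)                 # primal feasibility
assert np.allclose(P @ x_star + q + A.T @ nu, 0)  # stationarity
assert np.allclose(x_star, [0.5, 0.5])            # projection of 0 onto the line
```

With inequality constraints present, one would instead use an interior-point or active-set method (or a modeling tool such as CVXPY).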

10.12.2.2. Quadratically Constrained Quadratic Program#

Definition 10.43 (Quadratically constrained quadratic program)

A convex optimization problem is known as a quadratically constrained quadratic program (QCQP) if the objective function and the inequality constraint functions are (convex) quadratic while the equality constraint functions are affine.

A general quadratically constrained quadratic program has the following form:

(10.31)#\[\begin{split}& \text{minimize } & & \frac{1}{2} \bx^T \bP_0 \bx + \bq_0^T \bx + r_0 \\ & \text{subject to } & & \frac{1}{2} \bx^T \bP_i \bx + \bq_i^T \bx + r_i \leq 0 & \quad i=1,\dots, m\\ & & & \bA \bx = \bb\end{split}\]

where

  1. \(\bx \in \RR^n\) is the optimization variable.

  2. \(\bP_i \in \SS^n_+\) are symmetric positive semidefinite matrices for \(i=0,\dots,m\).

  3. \(\bq_i \in \RR^n\) and \(r_i \in \RR\) for \(i=0,\dots,m\).

  4. \(f_0(\bx) = \frac{1}{2} \bx^T \bP_0 \bx + \bq_0^T \bx + r_0\) is a (convex) quadratic objective function.

  5. \(f_i(\bx) = \frac{1}{2} \bx^T \bP_i \bx + \bq_i^T \bx + r_i\) are (convex) quadratic inequality constraint functions for \(i=1,\dots,m\).

  6. \(\bA \in \RR^{p \times n}\) and \(\bb \in \RR^p\) describe the \(p\) affine equality constraints.
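A small QCQP instance makes the template concrete. The sketch below (assuming NumPy; the instance is a hypothetical choice with \(\bP_0 = \ZERO\), \(\bq_0 = \bq\), \(r_0 = 0\) and a single ball constraint \(\bP_1 = \bI\), \(\bq_1 = \bzero\), \(r_1 = -\tfrac{1}{2}\)) minimizes a linear objective over the unit ball, whose closed-form solution is the boundary point opposite \(\bq\); it then checks optimality against random feasible points.

```python
import numpy as np

# Hypothetical QCQP: minimize q^T x subject to (1/2) x^T x - 1/2 <= 0 (unit ball).
# Template data: P0 = 0, q0 = q, r0 = 0; P1 = I, q1 = 0, r1 = -1/2.
q = np.array([3.0, -4.0])
x_star = -q / np.linalg.norm(q)   # closed-form minimizer: x* = -q / ||q||

feasible = lambda x: 0.5 * x @ x - 0.5 <= 1e-12
assert feasible(x_star)

# No random feasible point does better than q^T x* = -||q|| = -5.
rng = np.random.default_rng(3)
for _ in range(1000):
    z = rng.standard_normal(2)
    z = z / max(1.0, np.linalg.norm(z))   # project into the unit ball
    assert q @ z >= q @ x_star - 1e-9
```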