10.1. Convex Optimization#

Main references for this section are [6, 9, 17].

10.1.1. Convex Optimization Problems#

Definition 10.1 (Convex optimization problem general form)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(f : \VV \to \RR\) be a convex function with \(S = \dom f\). Let \(C \subseteq S \subseteq \VV\) be a convex set.

A mathematical optimization problem of the form

\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]

is known as a convex optimization problem.

In other words, a convex optimization problem minimizes a convex function over a convex set. We can write the optimal value of the minimization problem as

\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]

The set of optimal points for the minimization problem is

\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]

Note

  1. \(f\) is the objective function (or cost function) being minimized.

  2. \(\bx\) is the optimization variable.

  3. \(C\) is the feasible set. Any \(\bx \in C\) is a feasible point.

  4. The problem is feasible if \(C \neq \EmptySet\).

  5. If \(C = \EmptySet\), then the problem is infeasible.

  6. If the problem is infeasible then \(p^* = \infty\).

  7. If the problem is feasible then \(p^* < \infty\).

  8. If \(p^* = -\infty\), then the problem is unbounded below.

  9. We say that \(\bx^*\) is an optimal point for the minimization problem if \(f(\bx^*) = p^*\).

  10. If \(X_{\text{opt}}\) is nonempty, then we say that the optimal value \(p^*\) is attained at some point.

  11. It is possible that \(p^*\) is finite and yet the optimal set is empty.

Remark 10.1 (Closedness of the feasible set)

Some authors prefer to define convex optimization as the minimization of a convex function over a closed and convex set.

  1. We have not insisted that \(C\) be a closed set in the general definition of convex optimization problem.

  2. The closedness of \(C\) is an additional property which can provide additional guarantees on the existence of an optimal point. We shall introduce it whenever needed in the discussion below.

  3. Recall from Theorem 9.170 that a closed and convex set \(C\) is an intersection of all the halfspaces that contain it. Let \(\{ A_i \}_{i \in I}\) be the set of halfspaces that contains \(C\).

  4. Then, each half space can be written as

    \[ A_i = \{ \bx \in \VV \ST \langle \bx, \ba_i \rangle \leq b_i \}. \]
  5. Thus, \(\bx \in C\) is equivalent to \(\langle \bx, \ba_i \rangle \leq b_i\) for every \(i \in I\).

  6. This gives us the motivation to consider the feasible set in the form of intersection of sublevel sets of convex functions.

Majority of convex optimization problems can be transformed into a functional form where the constraints are expressed in the form of sublevel sets of convex functions and the level sets of affine functions.

10.1.1.1. Convex Optimization Standard Form#

Definition 10.2 (Convex optimization problem standard form)

Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form

(10.1)#\[\begin{split}& \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, & \quad i=1,\dots,m\\ & & & h_j(\bx) = 0, & \quad j=1,\dots,p\end{split}\]

with optimization variable \(\bx \in \VV\) is called a convex optimization problem in standard form if

  1. The objective function \(f_0: \VV \to \RR\) is a convex function.

  2. The inequality constraint functions \(f_i: \VV \to \RR\) are convex functions for \(i=1,\dots,m\).

  3. The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).

Observation 10.1 (Discussion on the convex optimization standard form)

  1. The domain of the problem is given by

    \[ \DDD = \dom f_0 \cap \bigcap_{i=1}^m \dom f_i \cap \bigcap_{j=1}^p \dom h_j. \]
  2. \(\dom h_j = \VV\) for every \(j=1,\dots,p\) since \(h_j\) are affine functions.

  3. Thus,

    \[ \DDD = \dom f_0 \cap \bigcap_{i=1}^m \dom f_i. \]
  4. By definition \(\dom f_i\) are convex for \(i=0,\dots,m\). Hence \(\DDD\) is convex.

  5. The feasible set \(C\) is given by

    \[ C = \dom f_0 \cap \bigcap_{i=1}^m f_i^{-1}(-\infty, 0] \cap \bigcap_{j=1}^p h_j^{-1}(0). \]
  6. Then, \(f_i^{-1}(-\infty, 0]\) are sublevel sets of convex functions hence they are also convex sets.

  7. By Theorem 4.209, the level sets of affine functions are affine sets.

  8. Thus, \(h_j^{-1}(0)\) is an affine set. Hence, it is a convex set.

  9. Thus, \(C\) is an intersection of convex sets.

  10. Thus, \(C\) is a convex set.

  11. We note that we can rewrite the standard form as

    \[\begin{split} & \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]

    to match with Definition 10.1.

Remark 10.2 (Adding closedness of objective and constraint functions)

Suppose we add an additional requirement that the function \(f_i\) for \(i=0,\dots,m\) are closed.

  1. Recall from Definition 3.64 that a function is closed if all its sublevel sets are closed.

  2. In particular, this means that the domain of a closed function is also closed.

  3. Thus, \(\DDD\) is a closed set.

  4. Then, \(f_i^{-1}(-\infty, 0]\) are closed sets since \(f_i\) are closed functions.

  5. Since \(\VV\) is finite dimensional, hence affine sets are closed. Hence \(h_j^{-1}(0)\) is closed for every \(j=1,\dots,p\).

  6. Thus, \(C\) is an intersection of closed and convex sets.

  7. Thus, \(C\) is a closed and convex set.

  8. Recall from Theorem 3.92 that a function is closed if and only if it has a closed epigraph.

  9. Also, recall from Theorem 3.119 that a function is closed if and only if it is l.s.c. (lower semicontinuous).

  10. Further, recall from Theorem 9.172 that every convex function has a closure which is l.s.c. (hence closed) given by its lower semicontinuous hull and the closure of a convex function is also convex.

  11. Thus, if any of the \(f_i\) for \(i=0,\dots,m\) is convex but not a closed function, we can replace it by its lower semicontinuous hull to convert it to a closed convex function. This way, be can bring such problems to the standard form for convex optimization.

Remark 10.3 (Comparison of two forms)

We have seen two forms of describing convex optimization problems.

  1. Definition 10.1 describes it as minimizing a convex function over a convex set.

  2. Definition 10.2 describes it in the form of minimizing a convex objective function with convex inequality constraints and affine equality constraints.

  3. As seen in comments above, the second form can be converted into the first form.

  4. The first form is more general. It is very useful in establishing the theoretical properties of convex optimization problems.

  5. The second form is more useful from practical perspective. Almost all real life convex optimization problems can be reduced to this form. It is easier to describe algorithms to solve convex optimization problems in terms of this form.

In the sequel, we will liberally use either form for proving theoretical results and developing algorithms.

Definition 10.3 (Concave function maximization problem)

Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form

(10.2)#\[\begin{split}& \text{maximize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, & \quad i=1,\dots,m\\ & & & h_j(\bx) = 0, & \quad j=1,\dots,p\end{split}\]

with optimization variable \(\bx \in \VV\) is called a concave maximization problem if

  1. The objective function \(f_0: \VV \to \RR\) is a concave function.

  2. The inequality constraint functions \(f_i: \VV \to \RR\) are convex functions for \(i=1,\dots,m\).

  3. The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).

Remark 10.4 (Concave maximization as a convex optimization problem)

We note that maximizing \(f_0\) is same as minimizing \(-f_0\). Further, if \(f_0\) is concave then \(-f_0\) is convex. Thus, (10.2) is equivalent to the problem

\[\begin{split} & \text{minimize } & & -f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, & \quad i=1,\dots,m\\ & & & h_j(\bx) = 0, & \quad j=1,\dots,p \end{split}\]

which is a convex optimization problem in standard form. With an abuse of notation, we shall call a concave function maximization program also a convex optimization program.

10.1.1.2. Proper Convex Functions#

It is useful to extend these definitions to proper convex functions as objective functions and/or inequality constraint functions.

Definition 10.4 (Optimization problem general form for proper convex functions)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(f : \VV \to \RERL\) be a proper convex function with \(S = \dom f\). Let \(X \subseteq \VV\) be a convex set.

A mathematical optimization problem of the form

\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in X \end{split}\]

is known as a convex optimization problem for proper convex functions.

In other words, this problem minimizes a proper convex function over a convex set. We can write the optimal value of the minimization problem as

\[ p^* = \inf \{ f(\bx) \ST \bx \in X \}. \]

The set of feasible points is given by \(C = X \cap \dom f\) which is a convex set.

The set of optimal points for the minimization problem is

\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]

Definition 10.5 (Convex optimization problem standard form for proper convex functions)

Let \(\VV\) be an \(n\)-dimensional real vector space. A mathematical optimization problem of the form

(10.3)#\[\begin{split}& \text{minimize } & & f_0(\bx) \\ & \text{subject to } & & f_i(\bx) \leq 0, & \quad i=1,\dots,m\\ & & & h_j(\bx) = 0, & \quad j=1,\dots,p\end{split}\]

with optimization variable \(\bx \in \VV\) is called a convex optimization problem in standard form for proper convex functions if

  1. The objective function \(f_0: \VV \to \RERL\) is a proper convex function.

  2. The inequality constraint functions \(f_i: \VV \to \RERL\) are proper convex functions for \(i=1,\dots,m\).

  3. The equality constraint functions \(h_j: \VV \to \RR\) are affine functions for \(j=1,\dots,p\).

The reader can check that the definitions of problem domain, feasible set, optimal value, optimal solutions/points, etc. can be easily extended by using the effective domains of \(f_0, \dots, f_m\).

10.1.1.3. Local and Global Optima#

Theorem 10.1 (Local minimum is a global minimum in convex optimization)

Let \(f : \VV \to \RR\) be a convex function. Let \(C \subseteq \dom f\) be a convex set. Consider the problem of minimizing \(f\) over \(C\). Let \(\bx^*\) be locally optimal for \(f\) over \(C\). Then, \(\bx^*\) is globally optimal for \(f\) over \(C\).

In other words,

\[ f(\by) \geq f(\bx^*) \Forall \by \in C. \]

Proof. Since \(\bx^*\) is a local minimum of \(f\) over \(C\), hence there exists \(r > 0\) such that

\[ f(\bx^*) = \inf \{f(\bz) \ST \bz \in B[\bx, r] \cap C \}. \]

In other words, \(f(\bx) \geq f(\bx^*)\) for every \(\bx \in B[\bx, r] \cap C\).

  1. Let \(\by \in C\) be any point such that \(\by \neq \bx^*\).

  2. We need to show that \(f(\by) \geq f(\bx^*)\).

  3. Let \(t \in (0,1]\) be such that \(\bz = \bx^* + t(\by - \bx^*) \in B[\bx, r]\).

  4. Since \(C\) is convex, hence \(\bz \in C\) as \(\bx^*, \by \in C\) and \(\bz\) is their convex combination.

  5. Thus, \(\bz \in B[\bx, r] \cap C\).

  6. By the local optimality condition

    \[\begin{split} f(\bx^*) &\leq f(\bz) \\ &= f(\bx^* + t(\by - \bx^*)) \\ &= f( (1 - t) \bx^* + t \by) \\ &\leq (1 -t )f(\bx^*) + t f(\by). \end{split}\]

    We used the fact that \(f\) is convex.

  7. Cancelling and rearranging the terms, we get \(t f(\bx^*) \leq t f(\by)\).

  8. Since \(t > 0\), hence it reduces to \(f(\bx^*) \leq f(\by)\).

  9. Thus, \(\bx^*\) is indeed globally optimal.

Corollary 10.1 (Local minimum is a global minimum in for proper convex functions)

Let \(f : \VV \to \RERL\) be a proper convex function. Let \(X\) be a convex subset of \(\VV\). Consider the problem of minimizing \(f\) over \(X\). Let \(\bx^*\) be locally optimal for \(f\) over \(X\). Then, \(\bx^*\) is globally optimal for \(f\) over \(X\).

In other words,

\[ f(\by) \geq f(\bx^*) \Forall \by \in X. \]

Proof. We note that the feasible set is \(C = X \cap \dom f\). Since both \(X\) and \(\dom f\) are convex, hence \(C\) is convex.

  1. If \(C = \EmptySet\), then there are no feasible points. Hence, no local/global optima.

  2. We now consider the case where \(C \neq \EmptySet\).

  3. Following the argument in Theorem 10.1, if \(\bx^*\) is locally optimal for \(f\) over \(C\), then it is globally optimal for \(f\) over \(C\).

  4. Consequently, it is globally optimal for \(f\) over \(X\) as \(f(\bx) = \infty > f(\bx^*)\) for every \(\bx \in X \setminus \dom f\).

The argument can be modified to show that if \(f\) is strictly convex, then a locally optimal point for \(f\) is strictly globally optimal point.

Theorem 10.2 (Local minimum is strict global minimum for strictly convex functions)

Let \(f : \VV \to \RR\) be a strictly convex function. Let \(C \subseteq \dom f\) be a convex set. Let \(\bx^*\) be locally optimal for \(f\) over \(C\). Then, \(\bx^*\) is strictly globally optimal for \(f\) over \(C\).

In other words,

\[ f(\by) > f(\bx^*) \Forall \by \in C. \]

Proof. There exists \(r > 0\) such that \(f(\bx) \geq f(\bx^*)\) for every \(\bx \in B[\bx, r] \cap C\).

  1. Let \(\by \in C\) be any point such that \(\by \neq \bx^*\).

  2. Let \(t \in (0,1)\) be such that \(\bz = \bx^* + t(\by - \bx^*) \in B[\bx, r]\).

  3. Since \(C\) is convex, hence \(\bz \in C\) as \(\bx^*, \by \in C\) and \(\bz\) is their convex combination.

  4. Thus, \(\bz \in B[\bx, r] \cap C\).

  5. Note that \(\bz = (1-t) \bx^* + t \by\). Thus, \(\bz\) is distinct from \(\bx^*\) and \(\by\) and lies in the line segment between them.

  6. By the local optimality condition

    \[\begin{split} f(\bx^*) &\leq f(\bz) \\ &= f( (1 - t) \bx^* + t \by) \\ &< (1 -t )f(\bx^*) + t f(\by). \end{split}\]

    We used the fact that \(f\) is strictly convex.

  7. Cancelling and rearranging the terms, we get \(t f(\bx^*) < t f(\by)\).

  8. Since \(t > 0\), hence it reduces to \(f(\bx^*) < f(\by)\).

  9. Thus, \(\bx^*\) is indeed strictly globally optimal.

Corollary 10.2 (Local minimum is strict global minimum for proper strictly convex functions)

Let \(f : \VV \to \RERL\) be a proper strictly convex function. Let \(X\) be a convex subset of \(\VV\). Let \(\bx^*\) be locally optimal for \(f\) over \(X\). Then, \(\bx^*\) is strictly globally optimal for \(f\) over \(X\).

In other words,

\[ f(\by) > f(\bx^*) \Forall \by \in X. \]

10.1.1.4. Optimal Sets#

The optimal sets of a convex optimization problem are also convex.

Theorem 10.3 (Optimal set is convex for a convex optimization problem)

Let \(f : \VV \to \RR\) be a convex function. Let \(C \subseteq \dom f\) be a convex set. Let the optimal value for the minimization of \(f\) over \(C\) be given by

\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]

Let the optimal set (the set of optimal points) be given by

\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]

Then, \(X_{\text{opt}}\) is a convex set.

Proof. If \(X_{\text{opt}}\) is empty, then it is convex trivially. Now consider the case where \(X_{\text{opt}}\) is nonempty.

  1. Let \(\bx, \by \in X_{\text{opt}}\).

  2. Then, \(\bx, \by \in C\).

  3. Let \(t \in [0, 1]\).

  4. Let \(\bz = t \bx + (1- t) \by\).

  5. Since \(C\) is convex, hence \(\bz \in C\). Thus, \(\bz \in \dom f\).

  6. Since \(f\) is convex, hence

    \[ f(\bz) \leq t f(\bx) + (1-t)f(\by) = t p^* + (1-t)p^* = p^*. \]
  7. But \(p^* = \inf \{ f(\bx) \ST \bx \in C \}\).

  8. Thus, \(f(\bz) \geq p^*\).

  9. Combining, the two inequalities, we get \(p^* = f(\bz)\).

  10. Thus, \(\bz \in X_{\text{opt}}\).

  11. Thus, for every \(\bx, \by \in X_{\text{opt}}\) and every \(t \in [0,1]\), \(\bz = t \bx + (1-t) \by \in X_{\text{opt}}\).

  12. Thus, \( X_{\text{opt}}\) is indeed convex.

Theorem 10.4 (Optimal points for minimization of strictly convex functions)

Let \(f : \VV \to \RR\) be a strictly convex function. Let \(C \subseteq \dom f\) be a convex set. Then, the minimization problem

\[\begin{split} & \text{minimize } & & f(\bx) \\ & \text{subject to } & & \bx \in C \end{split}\]

has at most one optimal point.

Proof. Let the optimal value for the minimization of \(f\) over \(C\) be given by

\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]

Let the optimal set (the set of optimal points) be given by

\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]
  1. By Theorem 10.1, \(X_{\text{opt}}\) is a convex set.

  2. If \(X_{\text{opt}}\) is empty or a singleton, there is nothing more to prove.

  3. For contradiction, assume that there are two distinct points \(\bx, \by \in X_{\text{opt}}\).

  4. We have \(p^* = f(\bx) = f(\by)\).

  5. Let \(\bz = \frac{1}{2} \bx + \frac{1}{2} \by\).

  6. Thus, it is a convex combination of \(\bx\) and \(\by\).

  7. By convexity of \(X_{\text{opt}}\), \(\bz \in X_{\text{opt}}\). Thus, \(f(\bz) = p^*\).

  8. By strict convexity of \(f\)

    \[ f(\bz) < \frac{1}{2} f(\bx) + \frac{1}{2} f(\by) = p^*. \]
  9. This contradicts the fact that \(p^*\) is the optimal value for the minimization problem.

  10. Thus, \(X_{\text{opt}}\) must be either empty or a singleton.

  11. Thus, the minimization problem has at most one optimal point.

10.1.2. Concave Objective Functions#

Theorem 10.5 (Minimization of concave function and relative interior)

Let \(\VV\) be an \(n\)-dimensional real normed linear space. Let \(f : \VV \to \RR\) be a concave function with \(S = \dom f\). Let \(C \subseteq S\) be a convex set. Consider the problem of minimizing \(f\) over \(C\). Let the optimal value be given by

\[ p^* = \inf \{ f(\bx) \ST \bx \in C \}. \]

Let the set of optimal points for the minimization problem be given by

\[ X_{\text{opt}} = \{ \bx \in C \ST f(\bx) = p^* \}. \]

If \(X_{\text{opt}}\) contains a relative interior point of \(C\), then \(f\) must be constant over \(C\); i.e.,

\[ X_{\text{opt}} = C. \]

Proof. Assume that \(\bx \in \relint C \cap X_{\text{opt}}\).

  1. Let \(\by \in C\) be any vector distinct from \(\bx\).

  2. Since \(\bx \in \relint C\), hence due to Theorem 9.144, there exists \(s > 1\) such that \(\bz = \bx + (s -1) (\bx - \by) \in C\). \(\bz\) is a point behind \(\bx\) on the line from \(\by\) to \(\bx\) which belongs to \(C\).

  3. Letting \(t = \frac{1}{s}\) we can rewrite it as

    \[ \bx = t \bz + (1-t) \by. \]
  4. Thus, \(\bx\) is a convex combination of \(\bz\) and \(\by\).

  5. By concavity of \(f\), we have

    \[ f(\bx) \geq t f(\bz) + (1-t) f(\by). \]
  6. Since \(\bx\) is an optimal point over \(C\) and \(\bz, \by \in C\), hence \(f(\bx) \leq f(\bz)\) and \(f(\bx) \leq f(\by)\).

  7. Thus,

    \[ f(\bx) \geq t f(\bz) + (1-t) f(\by) \geq t f(\bx) + (1-t) f(\bx) = f(\bx). \]
  8. This can be true only if \(f(\bx) = t f(\bz) + (1-t) f(\by)\) which in turn implies that \(f(\bx) = f(\bz) = f(\by)\).

  9. Thus, \(f\) must be constant over \(C\).

Corollary 10.3 (Linear functions attain minimum only at boundary points)

Let \(\VV\) be an \(n\)-dimensional real normed linear space. Let \(f : \VV \to \RR\) be a nonzero linear functional. Let \(C \subseteq \VV\) be a convex set. Then, \(f\) cannot attain a minimum at any relative interior point of \(C\).

In other words, \(f\) can attain a minimum only at the relative boundary of \(C\) (if it does attain a minimum over \(C\)).

Proof. We note that every linear functional is also concave.

  1. Let \(\bx \in \relint C\).

  2. Then, there exists \(r > 0\) such that \(B(\bx, r) \cap \affine C \subseteq C\).

  3. Since \(f\) is linear, hence \(f\) cannot be constant over \(B(\bx, r) \cap \affine C\).

  4. Thus, by Theorem 10.5, it cannot attain a minimum at \(\bx\).