# Basic Duality

## Contents

# 10.4. Basic Duality#

## 10.4.1. The Geometry of the \(\VV \oplus \RR\) Space#

As we concern ourselves with the optimization of proper functions of type \(f: \VV \to \RERL\), it is pertinent for us to introduce some terminology to explore the geometry of the \(\VV \oplus \RR\) space. We recall that \(\epi f \subseteq \VV \oplus \RR\).

The \(\VV\) part forms the horizontal axes for the product space \(\VV \oplus \RR\).

The \(\RR\) part forms the vertical axis.

We write the vectors in \(\VV \oplus \RR\) as \((\bx, t)\) where \(\bx \in \VV\) and \(t \in \RR\).

A hyperplane in \(\VV \oplus \RR\) is associated with a nonzero normal vector of the form \((\ba, b)\) where \(\ba \in \VV\) and \(b \in \RR\).

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST \langle \bx, \ba \rangle + t b = c \}. \]Since the normal vector must be nonzero, hence either \(\ba\) or \(b\) or both must be nonzero.

If a specific point \((\bx_0, t_0) \in H\), then

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST \langle \bx, \ba \rangle + t b = \langle \bx_0, \ba \rangle + t_0 b \}. \]

### 10.4.1.1. Different Types of Hyperplanes#

(Vertical, horizontal and nonvertical hyperplanes)

Let \(H\) be a hyperplane of \(\VV \oplus \RR\) with a normal vector \((\ba, b)\).

The hyperplane is called

*horizontal*if \(\ba = \bzero\).The hyperplane is called

*vertical*if \(b = 0\).The hyperplane is called

*nonvertical*if \(b \neq 0\).

Let us see how these definitions can be interpreted.

### 10.4.1.2. Horizontal Hyperplanes#

(Horizontal hyperplanes)

Consider the case where \(\ba = \bzero\).

The hyperplane description reduces to

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST t b = c \}. \]It simplifies to

\[ H = \left \{ (\bx, t) \in \VV \oplus \RR \ST t = \frac{c}{b} \right \} \]since \(b\) must be nonzero.

Along the \(\VV\) axes, the points in set \(H\) can take any value but along the vertical axis, they must take a fixed value given by \(\frac{c}{b}\).

We can see that \(H\) is a hyperplane which is parallel to \(\VV \times \{ 0 \}\).

For the specific case where \(c = 0\), \(H = \VV \times \{ 0 \}\).

Hence they are called horizontal hyperplanes.

Note that \(H\) intersects with the vertical axis at the point \((\bzero, \frac{c}{b})\).

### 10.4.1.3. Vertical Hyperplanes#

(Vertical hyperplanes)

Now consider the case where \(b = 0\).

The hyperplane description reduces to

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST \langle \bx, \ba \rangle = c \}. \]The set \(H_v = \{ \bx \in \VV \ST \langle \bx, \ba \rangle = c \}\) describes a hyperplane of \(\VV\).

\(H\) is constructed by allowing \(H_v\) to slide along the vertical axis as any value is allowed in the last coordinate (vertical axis).

Hence this is called a vertical hyperplane.

### 10.4.1.4. Nonvertical Hyperplanes#

(Intersection of nonvertical hyperplane with vertical axis)

If a hyperplane \(H\) with the normal vector \((\ba, b)\) is nonvertical (i.e., \(b \neq 0\)), then it intersects with the vertical axis at a unique point.

Indeed the vertical axis is identified with the set of points

\[ L = \{ (\bx, t) \in \VV \oplus \RR \ST \bx = \bzero \}. \]The hyperplane is given by

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST \langle \bx, \ba \rangle + t b = c \}. \]Then the point of intersection is given by \((\bzero, t)\) where

\[ t = \frac{c}{b}. \]In particular, assume that \((\bx, s) \in H\).

Then by definition of \(H\),

\[ c = \langle \bx, \ba \rangle + s b. \]Hence

\[ t = \frac{1}{b} \langle \bx, \ba \rangle + s. \]

### 10.4.1.5. Vertical Lines#

(Vertical line)

A *vertical line* in \(\VV \oplus \RR\) is a set of the
form \(\{ (\bx, t) \ST t \in \RR \}\) where \(\bx \in \VV\)
is a fixed vector.

A vertical hyperplane is a union of vertical lines.

A vertical halfspace is also a union of vertical lines.

If \(f\) is a proper function, then its epigraph doesn’t contain a vertical line.

It enables us to wonder if the epigraph of a proper function is contained entirely in a closed halfspace corresponding to some nonvertical hyperplane.

### 10.4.1.6. Nonvertical Supporting Hyperplanes#

(Convex sets and nonvertical supporting hyperplanes)

Let \(C\) be a nonempty convex subset of \(\VV \oplus \RR\) that contains no vertical lines. Then

\(C\) is contained in a closed halfspace corresponding to a nonvertical hyperplane; i.e., there exists a vector \(\ba \in \VV\), a nonzero scalar \(b \in \RR\), and a scalar \(c \in \RR\) such that

\[ \langle \bx, \ba \rangle + t b \geq c \Forall (\bx, t) \in C. \]If a point \((\bar{\bx}, \bar{t})\) doesn’t belong to \(\closure C\), there exists a nonvertical hyperplane strongly separating \((\bar{\bx}, \bar{t})\) and \(C\).

Proof. (1) We prove this claim by contradiction.

Assume that every hyperplane such that \(C\) is contained in one of its closed half-spaces is vertical.

Then every hyperplane such that \(\closure C\) is contained in one of its closed half spaces is vertical.

By Theorem 9.170, \(\closure C\) is the intersection of all closed half-spaces that contain it.

Hence, we have

\[ \closure C = \bigcap_{i \in I}\{ (\bx, t) \ST \langle \bx, \ba_i \rangle \geq c_i \} \]where \(I\) is an index set for the family of the hyperplanes that contain \(\closure C\) in its closed half spaces above. Since the hyperplanes are vertical, hence \(b_i = 0\) for ever \(i \in I\).

Let \((\tilde{\bx}, \tilde{t}) \in \closure C\).

Then the vertical line \(\{ (\tilde{\bx}, t) \ST t \in \RR \}\) also belongs to \(\closure C\) as it satisfies the expression above.

Hence, the vector \((\bzero, 1)\) is a direction of recession for \(\closure C\). In fact it belongs to its lineality space. In other words, \((\bzero, -1)\) is also a direction of recession for \(\closure C\).

By Theorem 9.150,

\[ \relint \closure C = \relint C \]since \(C\) is a convex set.

By Theorem 9.182, the recession cones of \(\closure C\) and \(\relint C\) are equal.

Hence \((\bzero, 1)\) and \((\bzero, -1)\) are also directions of recession for \(\relint C\).

Hence for every \((\tilde{\bx}, \tilde{t}) \in \relint C\), the vertical line \(\{ (\tilde{\bx}, t) \ST t \in \RR \}\) also belongs to \(\relint C\) and hence to \(C\).

This contradicts the hypothesis that \(C\) doesn’t contain a vertical line.

(2) This follows from strict separation theorem.

Since \((\bar{\bx}, \bar{t}) \notin \closure C\), hence, due to Theorem 9.169, there exists a hyperplane strongly separating \(\closure C\) and \((\bar{\bx}, \bar{t})\).

If this hyperplane is nonvertical, then there is nothing more to do.

We now consider the case where this strongly separating hyperplane is vertical.

Let this separating hyperplane be given by

\[ H = \{ (\bx, t) \in \VV \oplus \RR \ST \langle \bx, \bar{\ba} \rangle = \bar{c} \} \]such that

\[ \langle \bx, \bar{\ba} \rangle > \bar{c} > \langle \bar{\bx}, \bar{\ba} \rangle \Forall (\bx, t) \in \closure C. \]Our goal is to combine this vertical hyperplane with a suitably constructed nonvertical hyperplane so that the combined nonvertical hyperplane also strongly separates \(C\) and \((\bar{\bx}, \bar{t})\).

By part (1), there exists a nonvertical hyperplane such that \(C\) is totally contained in one of its closed half spaces.

Hence there exists \((\hat{\ba}, \hat{b}) \in \VV \oplus \RR\) with \(\hat{b} \neq 0\) and \(\hat{c} \in \RR\) such that

\[ \langle \bx, \hat{\ba} \rangle + t \hat{b} \geq \hat{c} \Forall (\bx, t) \in \closure C. \]Multiply this inequality with some \(\epsilon > 0\) and add with the previous inequality to get

\[ \langle \bx, \bar{\ba} + \epsilon \hat{\ba} \rangle + \epsilon t \hat{b} > \bar{c} + \epsilon \hat{c} \Forall (\bx, t) \in \closure C, \Forall \epsilon > 0. \]Since \(\bar{c} > \langle \bar{\bx}, \bar{\ba} \rangle\), it is possible to select a small enough \(\epsilon > 0\) such that

\[ \bar{c} + \epsilon \hat{c} > \langle \bar{\bx}, \bar{\ba} + \epsilon \hat{\ba} \rangle + \epsilon \bar{t} \hat{b}. \]Thus, for the small enough \(\epsilon > 0\), we have

\[ \langle \bx, \bar{\ba} + \epsilon \hat{\ba} \rangle + \epsilon t \hat{b} > \bar{c} + \epsilon \hat{c} > \langle \bar{\bx}, \bar{\ba} + \epsilon \hat{\ba} \rangle + \epsilon \bar{t} \hat{b} \Forall (\bx, t) \in \closure C. \]Letting \(\tilde{\ba} = \bar{\ba} + \epsilon \hat{\ba}\), \(\tilde{b} = \epsilon \hat{b}\) and \(\tilde{c} = \bar{c} + \epsilon \hat{c}\), we have

\[ \langle \bx, \tilde{\ba} \rangle + t \tilde{b} > \tilde{c} > \langle \bar{\bx}, \tilde{\ba} \rangle + \bar{t} \tilde{b} \Forall (\bx, t) \in \closure C. \]This describes the strongly separating nonvertical hyperplane between \(\closure C\) and \((\bar{\bx}, \bar{t})\).

(Upper closed halfspace)

Let \(H\) be a nonvertical hyperplane.
The closed halfspace of \(H\) whose recession cone
includes the vertical halfline \(\{(\bzero, t) \ST t \geq 0 \}\)
is known as its *upper* closed halfspace.

If you are in the upper closed halfspace and keep going up, you will stay in the upper closed halfspace. If you go down, you will hit the hyperplane.

(Lower closed halfspace)

Let \(H\) be a nonvertical hyperplane.
The closed halfspace of \(H\) whose recession cone
includes the vertical halfline \(\{(\bzero, t) \ST t \leq 0 \}\)
is known as its *lower* closed halfspace.

If you are in the lower closed halfspace and keep going down, you will stay in the lower closed halfspace. If you go up, you will hit the hyperplane.

## 10.4.2. Min Common/Max Crossing Duality#

We introduce two simple optimization problems.

Consider a set \(M\) in \(\VV \oplus \RR\).

Assume that \(M\) intersects with the vertical axis; i.e., \(R = \{ (\bzero, t) \ST t \in \RR \}\).

The

*min common point*problem attempts to find the point \(\bx \in M \cap R\) whose component along vertical axis is the minimum.Now consider the set of nonvertical hyperplanes such that \(M\) lies in their upper closed halfspaces.

Each such hyperplane intersects with the vertical axis.

Such points are called the crossing points between the nonvertical hyperplane and the vertical axis.

Consider the set of all such crossing points.

The

*max crossing point*problem attempts to find the point whose vertical component is the largest (highest).Let \(\bp^*\) be the minimum common point.

Let \(\bq^*\) be the maximum crossing point.

Let \(p^*\) and \(q^*\) denote the component along the vertical axis for \(\bp^*\) and \(\bq^*\) respectively.

In general, \(p^*\) lies above \(q^*\). We shall show this later formally.

We call \(p^*\) as minimum common level and \(q^*\) as maximum crossing level.

Then \(p^* \geq q^*\). This is known as

*weak duality*.The gap \(p^* - q^*\) which is a nonnegative quantity is known as the

*duality gap*.Under certain conditions \(p^* = q^*\). In other words, if the set \(M\) meets specific conditions, then the optimal value for the min common point problem (the primal problem) as well as the max crossing point problem (the dual problem) are identical.

When \(p^* = q^*\), then the duality gap is \(0\). This is known as

*strong duality*.When strong duality holds, then the min common point problem and the max crossing point problem are equivalent in the sense that they have the same solution.

We are now ready to define these problems formally.

### 10.4.2.1. Min Common Problem#

(Min common problem)

Given a set \(M \subseteq \VV \oplus \RR\), the
*min common problem* is defined as

Its optimal value is denoted by \(p^*\); i.e.,

### 10.4.2.2. Max Crossing Problem#

Recall that a nonvertical hyperplane has a normal \((\ba, b)\) such that \(b \neq 0\). Then \((\frac{\ba}{b}, 1)\) is also a normal vector for this hyperplane. Hence one way to describe the set of nonvertical hyperplanes is the set of hyperplanes with normal vectors of the form \((\ba, 1)\). Following Observation 10.3, if a nonvertical hyperplane \(H\) intersects the vertical axis at some \((\bzero, q)\), then

where \((\bx, t) \in H\). Thus, the hyperplane can be characterized by \(\ba\) and \(q\) as

The corresponding upper closed half halfspace is given by

since \(t \geq q\) for every point on the vertical axis in \(H^u_{\ba, q}\). The coordinate along the vertical axis for all the points on the vertical axis in the upper half space must be more than or equal to \(q\).

If the set \(M\) lies in the upper closed half space of a hyperplane characterized by \(\ba\) and \(q\), then

Hence, the maximum crossing level \(q\) over all hyperplanes \(H_{\ba, q}\) with the same normal \((\ba, 1)\) is given by

The problem of maximizing the crossing level over all nonvertical hyperplanes is to maximize over all \(\ba \in \VV\), the maximum crossing level corresponding to \(\ba\).

(Max crossing problem)

Given a set \(M \subseteq \VV \oplus \RR\), the
*max crossing problem* is defined as

where \(q(\ba) = \inf_{(\bx, t) \in M} \{ \langle \bx, \ba \rangle + t \}\). Its optimal value is denoted by \(q^*\); i.e.,

The function \(q\) is the cost function (objective function) of the max crossing problem. It turns out that \(q\) is concave and upper semicontinuous.

\(q\))

(Concavity and upper semicontinuity ofThe cost function \(q\) of the max crossing problem is concave and upper semicontinuous over \(\VV\).

Proof. Concavity

For each \((\bx, t)\), the function \(\ba \mapsto \langle \bx, \ba \rangle + t\) is an affine function.

Thus \(q\) is an infimum of a family of affine functions over \((\bx, t) \in M\).

Hence \(q\) is concave.

Upper semicontinuity

We note that \(-q(\ba) = \sup_{(\bx, t) \in M} \{ - \langle \bx, \ba \rangle - t \}\).

Hence \(-q\) is a supremum of affine functions.

Affine functions are closed (Theorem 9.86).

Pointwise supremum of closed functions is closed (Theorem 3.93).

Hence \(-q\) is a closed function.

By Theorem 3.119, \(-q\) is lower semicontinuous since it is closed.

Hence \(q\) is upper semicontinuous.

### 10.4.2.3. Weak Duality#

(Weak duality theorem)

For the min common and max crossing problems we have

Proof. Recall that

Then

Thus we have

Taking the supremum over \(\ba \in \VV\) on the L.H.S., we obtain

### 10.4.2.4. Strong Duality#

We are now interested in conditions under which there is no duality gap and \(p^* = q^*\). We shall assume in general that the min common problem is feasible and \(p^* < \infty\).

(Optimal point of min common problem is a closure point)

Assume that the min common problem is feasible and let \(p^* < \infty\). Then the point \((\bzero, p^*)\) is a closure point of \(M\).

We have \(p^* = \inf_{(\bzero, t) \in M} \{ t \}\).

Thus for every \(\epsilon > 0\) there exists \((\bzero, t) \in M\) such that \(t < p^* + \epsilon\).

Thus for every \(\epsilon > 0\), there exists a point \((\bzero, t) \in M\) such that \(\| (\bzero, t) - (\bzero, p^*)\| < \epsilon\).

Hence \((\bzero, p^*)\) is a closure point of \(M\).

\(M\))

(Closed and convexIf \(M\) is closed and convex and admits a nonvertical supporting hyperplane at \((\bzero, p^*)\), then the strong duality holds and \(p^* = q^*\). Also, the optimal values of both the min common problem and max crossing problem are attained.

Since \(M\) is closed, hence \((\bzero, p^*) \in M\) since \((\bzero, p^*)\) is a closure point of \(M\).

Hence the optimal value of min common problem is attained at \((\bzero, p^*)\).

Let \(H\) be the nonvertical supporting hyperplane at \((\bzero, p^*)\).

Then intersection of \(H\) with the vertical axis is at \((\bzero, p^*)\).

Hence \(q^* \geq p^*\).

But by weak duality \(q^* \leq p^*\).

Hence \(q^* = p^*\).

Clearly the optimal value of max crossing problem is attained at \((\bzero, p^*)\) for the hyperplane \(H\).

This is the most favorable case where strong duality holds as well as the optimal values of both problems are attained. We next provide a result that provides a necessary and sufficient condition for the strong duality however does not address the issue of attainment of the optimal values.

### 10.4.2.5. First Min Common / Max Crossing Theorem#

(Min common/max crossing theorem I)

Consider the min common and max crossing problems. Assume the following:

\(p^* < \infty\); i.e., the min common problem is feasible.

The set

\[ \overline{M} = \{ (\bx, t) \in \VV \oplus \RR \ST \text{ there exists } \bar{t} \in \RR \text{ with } \bar{t} \leq t \text{ and } (\bx, \bar{t}) \in M \} \]is convex.

Then we have \(q^* = p^*\) if and only if for every sequence \(\{ (\bx_k, t_k) \}\) of \(M\) with \(\bx_k \to \bzero\), there holds

The set \(\overline{M}\) is an extension of the set \(M\) going upwards along the vertical axis. In other words, all points above \(M\) are included in \(\overline{M}\). In other words, the direction \((\bzero, 1)\) is added to the recession cone. \(\overline{M}\) is unbounded along the \((\bzero, 1)\) direction.

Proof. We first consider the trivial case where \(p^* = -\infty\).

By weak duality (Theorem 10.32), \(q^* \leq p^*\).

Hence \(q^* = -\infty\).

Hence \(q(\ba) = -\infty\) for every \(\ba \in \VV\).

The conclusion follows trivially.

We now consider the general case where \(p^* \in \RR\). First assume that \(p^* \leq \liminf_{k \to \infty} t_k\) holds for every sequence \(\{ (\bx_k, t_k) \}\) of \(M\) with \(\bx_k \to \bzero\).

Since \(M \subseteq \overline{M}\) and \((\bzero, p^*)\) is a closure point of \(M\), hence \((\bzero, p^*)\) is also a closure point of \(\overline{M}\).

We first claim that the set \(\overline{M}\) doesn’t contain any vertical lines.

For contradiction, assume that \(\overline{M}\) contains a vertical line.

The set \(\closure \overline{M}\) also contains a vertical line then.

Then \((\bzero, -1)\) is a direction of recession of \(\closure \overline{M}\).

Hence \((\bzero, -1)\) is also a direction of recession of \(\relint \overline{M}\).

Since \((\bzero, p^*)\) is a closure point of \(\overline{M}\), hence it is also a closure point of \(\relint \overline{M}\).

Hence, there exists a sequence \(\{ (\bx_k, t_k) \}\) of \(\relint \overline{M}\) converging to \((\bzero, p^*)\).

Since \((\bzero, -1)\) is a direction of recession of \(\relint \overline{M}\), hence the sequence \(\{ (\bx_k, t_k -1) \}\) belongs to \(\relint \overline{M}\).

Hence its limit \((\bzero, p^* - 1) \in \closure \overline{M}\).

By definition of \(\overline{M}\), for every \(k\), there exists a point \((\bx_k \overline{t_k}) \in M\) such that \(\overline{t_k} \leq t_k - 1\).

Hence there exists a sequence \(\{ (\bx_k, \overline{t_k} ) \}\) of \(M\) with \(\overline{t_k} \leq t_k - 1\) for all \(k\) so that \(\liminf_{k \to \infty} \overline{t_k} \leq p^* - 1\).

This contradicts the assumption that \(p^* \leq \liminf_{k \to \infty} t_k\) since \(\bx_k \to \bzero\).

We next show that the vector \((\bzero, p^* - \epsilon)\) does not belong to \(\closure \overline{M}\) for any \(\epsilon > 0\).

Assume for contradiction that for some \(\epsilon > 0\), the vector \((\bzero, p^* - \epsilon) \in \closure \overline{M}\).

Then there is a sequence \(\{ (\bx_k, t_k) \}\) of \(\overline{M}\) converging to \((\bzero, p^* - \epsilon)\).

By definition of \(\overline{M}\), for every \(k\), there exists a point \((\bx_k \overline{t_k}) \in M\) such that \(\overline{t_k} \leq t_k\).

Hence there exists a sequence \(\{ (\bx_k, \overline{t_k} ) \}\) of \(M\) with \(\overline{t_k} \leq t_k\) for all \(k\) so that \(\liminf_{k \to \infty} \overline{t_k} \leq p^* - \epsilon\).

This contradicts the assumption that \(p^* \leq \liminf_{k \to \infty} t_k\) since \(\bx_k \to \bzero\).

Since \(\overline{M}\) does not contain any vertical lines and the vector \((\bzero, p^* - \epsilon)\) doesn’t belong to \(\closure \overline{M}\), hence, due to Theorem 10.30, there exists a nonvertical hyperplane strongly separating \((\bzero, p^* - \epsilon)\) and \(\overline{M}\) for every \(\epsilon > 0\).

This hyperplane crosses the vertical axis at a unique vector \((\bzero, \xi)\) which must lie between \((\bzero, p^* - \epsilon)\) and \((\bzero, p^*)\); i.e., \(p^* - \epsilon \leq \xi \leq p^*\).

By definition of max crossing problem, \(\xi \leq q^*\).

Hence we have \(p^* - \epsilon \leq q^* \leq p^*\) for every \(\epsilon > 0\).

Since \(\epsilon\) can be arbitrarily small, it follows that \(p^* = q^*\).

Conversely, we assume that the strong duality holds.

Let \(\{ (\bx_k, t_k) \}\) be any sequence of \(M\) such that \(\bx_k \to \bzero\).

By definition of \(q(\ba)\),

\[ q(\ba) = \inf_{(\bx, t) \in M} \{ \langle \bx, \ba \rangle + t \} \leq \langle \bx_k, \ba \rangle + t_k \Forall k, \Forall \ba \in \VV. \]Taking the limit on R.H.S. to \(k \to \infty\), we obtain

\[ q(\ba) \leq \liminf_{k \to \infty} t_k \Forall \ba \in \VV. \]Taking the supremum on the L.H.S. over \(\ba \in \VV\), we have

\[ p^* = q^* = \sup_{\ba \in \VV} q(\ba) \leq \liminf_{k \to \infty} t_k. \]

This result doesn’t guarantee the attainment of either the min common optimal point or the max crossing optimal point. The next result includes additional conditions which ensures that the optimal point of max crossing problem is attained.

### 10.4.2.6. Second Min Common / Max Crossing Theorem#

(Min common/max crossing theorem II)

Consider the min common and max crossing problems. Assume the following:

\(-\infty < p^*\).

The set

\[ \overline{M} = \{ (\bx, t) \in \VV \oplus \RR \ST \text{ there exists } \bar{t} \in \RR \text{ with } \bar{t} \leq t \text{ and } (\bx, \bar{t}) \in M \} \]is convex.

The set

\[ D = \{ \bu \in \VV \ST \text{ there exists } t \in \RR \text{ with } (\bu, t) \in \overline{M} \} \]contains the origin in its relative interior.

Then \(q^* = p^*\) and the optimal solution set of the max crossing problem

has the form

where \(\tilde{Q}\) is a nonempty, convex and compact set and \((\affine D)^{\perp}\) is the orthogonal complement of \(\affine D\) (which is a subspace by assumption 3 since it contains the origin).

Furthermore, \(Q^*\) is nonempty and compact if and only if \(D\) contains the origin in its interior.

Note that \(D\) is the set of all possible horizontal coordinates of points in \(M\). \(\bx \in D\) if and only if there exists some \((\bx, t) \in \overline{M}\) if and only if there exists some \((\bx, s) \in M\) with \(s \leq t\). Since \(\overline{M}\) is convex and \(D\) is a projection of \(\overline{M}\) on \(\VV\), hence \(D\) is also convex.

Proof. We first show that the strong duality holds and \(Q^*\) is nonempty, closed and convex.

Condition (3) implies that there exists some \((\bzero, t) \in \overline{M}\).

By definition of \(\overline{M}\), there exists some \((\bzero, \overline{t}) \in M\) such that \(\overline{t} \leq t\).

Hence \(p^* < \infty\).

Then, by condition (1), \(p^* \in \RR\); i.e., the min crossing level is a real number.

The vertical axis \(\{ (\bzero, t) \ST t \in \RR \}\) belongs to \(\affine \overline{M}\).

\(p^*\) is the optimal min common value.

Hence \((\bzero, p^*)\) is not a relative interior point of \(\overline{M}\).

Accordingly, \((\bzero, p^*) \notin \relint \overline{M}\).

By Theorem 9.163, there exists a hyperplane that separates \(\overline{M}\) and the point \((\bzero, p^*)\) properly; i.e., it contains the point \((\bzero, p^*)\), contains \(\overline{M}\) in one of its half-spaces and doesn’t contain \(\overline{M}\) fully.

Hence, there exists a vector \((\ba, r)\) such that

\[ \langle \bx, \ba \rangle + t r \geq p^* r \Forall (\bx, t) \in \overline{M} \]and

\[ \sup_{(\bx, t) \in \overline{M}} \langle \bx, \ba \rangle + t r > p^* r. \]See Remark 9.10.

Since for every \((\overline{\bx}, \overline{t}) \in M\), the set \(\overline{M}\) contains the half-line \(\{ (\overline{\bx}, t) \ST \overline{t} \leq t \}\), it follows from the first inequality that \(r \geq 0\) must hold true.

\(r = 0\) leads to a contradiction.

Assume that \(r = 0\).

Then the first inequality reduces to

\[ \langle \bx, \ba \rangle \geq 0 \Forall \bx \in D. \]The linear functional \(\langle \bx, \ba \rangle\) attains its minimum over the set \(D\) at \(\bzero \in D\) which is a relative interior point of \(D\) by condition (3).

Since \(D\) is convex (projection of convex \(\overline{M}\) on \(\VV\)), and the linear functional \(\langle \bx, \ba \rangle\) (a concave function) attains its minimum at a relative interior point, hence, due to Theorem 10.5, the function must be constant over \(D\).

Hence we have

\[ \langle \bx, \ba \rangle = 0 \Forall \bx \in D. \]But this contradicts the second (strict) inequality of the proper separation result since \(\overline{M}\) cannot be contained entirely inside the hyperplane.

We arrive at a contradiction.

Hence \(r \neq 0\) and the separating hyperplane is nonvertical.

By appropriate normalization, if necessary, we can assume that \(r=1\).

The proper separation inequalities simplify to

\[ \langle \bx, \ba \rangle + t \geq p^* \Forall (\bx, t) \in \overline{M} \]and

\[ \sup_{(\bx, t) \in \overline{M}} \langle \bx, \ba \rangle + t > p^*. \]We now note that

\[\begin{split} p^* &\leq \inf_{(\bx, t) \in \overline{M}} \langle \bx, \ba \rangle + t \\ &\leq \inf_{(\bx, t) \in M} \langle \bx, \ba \rangle + t \\ &= q(\ba) \\ &\leq q^*. \end{split}\]Since by weak duality, we always have \(q^* \leq p^*\), hence all the inequalities must be equalities in the previous relation.

Thus we have

\[ q(\ba) = q^* = p^*. \]Hence \(Q^*\) is nonempty as \(\ba \in Q^*\).

We can also write \(Q^*\) as

\[ Q^* = \{ \ba \in \VV \ST q(\ba) \geq q^* \}. \]Since \(q\) is concave and upper semicontinuous, hence its superlevel sets are closed and convex.

Hence \(Q^*\) being a superlevel set of \(q\) is convex and closed.

We next show that \(Q^* = (\affine D)^{\perp} + \tilde{Q}\).

TBD

## 10.4.3. Minimax and Maximin Problems#

We consider functions of the form \(\phi : \VV \oplus \WW \to \RR\) with \(\dom \phi = X \times Z\). \(\VV\) and \(\WW\) are real vector spaces. \(X \subseteq \VV\) and \(Z \subseteq \WW\) are nonempty sets.

(Minimax problem)

A *minimax* problem takes the form

(Maximin problem)

A *maximin* problem takes the form

We next provide a few examples.

### 10.4.3.1. Zero Sum Games#

(Zero sum games)

We consider a two player game with the following design.

Player A can choose one out of \(n\) possible moves.

Player B can choose one out of \(m\) possible moves.

Both players make their moves simultaneously.

\(\bA \in \RR^{n \times m}\) is a payoff matrix.

If move \(i\) is selected by player A and move \(j\) is selected by player B, then A gives the specified amount \(a_{i j}\) to B.

Note that \(a_{i j} \in \RR\) can be zero, positive or negative.

The players use mixed strategy.

Player A selects a probability distribution \(\bx = (x_1, \dots, x_n)\) over her \(n\) possible moves.

Player B selects a probability distribution \(\bz = (z_1, \dots, z_m)\) over her \(m\) possible moves.

Since the probability of selecting move \(i\) by player A and move \(j\) by player B is \(x_i z_j\), hence the expected amount to be paid by A to B is

\[ \sum_{i j} x_i a_{i j} z_j = \bx^T \bA \bz. \]Each player adopts a worst case viewpoint, whereby she optimizes her choice against the worst possible selection by the other player.

Player A must minimize \(\max_{\bz} \bx^T \bA \bz\) so that she has to pay as low as possible to B.

Suppose A selects the strategy \(\bx\).

The amount she has to pay to B for B’s selection of strategy \(\bz\) is \(\bx^T \bA \bz\).

The maximum amount she has to pay across all possible strategies chosen by B is \(\max_{\bz} \bx^T \bA \bz\).

By selecting \(\bx\), her goal is to minimize the maximum payoff.

Player B must maximize \(\min_{\bx} \bx^T \bA \bz\).

Suppose B selects a strategy \(\bz\).

Then the payoff she gets by a strategy \(\bx\) of A is \(\bx^T \bA \bz\).

The minimum she can get for her choice of \(\bz\) is \(\min_{\bx} \bx^T \bA \bz\).

By selecting \(\bz\) her goal is to maximize the minimum payoff.

Here \(X = \Delta_n \subseteq \RR^n\), the unit simplex of \(\RR^n\).

Similarly, \(Z = \Delta_m \subseteq \RR^m\), the unit simplex of \(\RR^m\).

\(\phi(\bx, \bz) = \bx^T \bA \bz\).

The worst case pay off for player A is

\[ \inf_{\bx \in X} \sup_{\bz \in Z }\bx^T \bA \bz. \]The worst case pay off for player B is

\[ \sup_{\bz \in Z} \inf_{\bx \in X }\bx^T \bA \bz. \]Under suitable conditions, we can guarantee that

\[ \sup_{\bz \in Z} \inf_{\bx \in X }\bx^T \bA \bz = \inf_{\bx \in X} \sup_{\bz \in Z }\bx^T \bA \bz. \]

### 10.4.3.2. Lagrangian Functions#

(Lagrangian functions and duality theory)

Consider the optimization problem of the form

where \(f, g_1, \dots, g_m : \VV \to \RR\) are given objective and inequality constraint functions.

We construct the Lagrangian function

with \(\dom L = X \oplus Z\) where \(X = \dom f \cap \dom g_1 \cap \dots \cap \dom g_m\) and \(Z = \RR^m_+\) (the nonnegative orthant where \(\bz \succeq \bzero\)).

We construct the *primal problem* as below:

Choose any \(\bx \in X\).

If any of the constraints \(g_i(\bx) \leq 0\) is violated, then \(g_i(\bx) > 0\) and hence \(\sup_{\bz \succeq \bzero } L (\bx, \bz ) = \infty\). We can easily achieve this by taking \(z_i \to \infty\).

If none of the constraints are invalidated, then by picking \(\bz = \bzero\), we have \(\sup_{\bz \succeq \bzero } L (\bx, \bz ) = f(\bx)\).

Hence the problem is equivalent to the original problem.

We can now construct the *dual problem* as below:

Thus the primal problem is a minimax problem and the dual problem is a maximin problem with the Lagrangian playing the role of \(\phi\).

Under suitable conditions, the two problems have equal optimal value.

### 10.4.3.3. Minimax Equality#

In the following, we will explore conditions under which

and the infimum and the supremum are attained.

The key result is the *saddle point theorem*
which guarantees the equality in (10.4)
as well as the attainment of the infimum/supremum assuming
convexity/concavity on \(\phi\)
and compactness on \(X\) and \(Z\).

The compactness assumptions are restrictive. For example, in the duality theory, the Lagrange multipliers \(\bz \succeq \bzero\) belong to the nonnegative orthant which is not compact.

The *minimax theorem* gives conditions
guaranteeing the minimax equality (10.4),
although it need not guarantee the attainment of the
infimum and the supremum.

### 10.4.3.4. Minimax Inequality#

(Minimax inequality)

We always have

In other words, the optimal value of the maximin problem is less than or equal to the optimum value of the minimax problem.

Proof. We note that for every \(\bz \in Z\), we have

Taking the supremum on the L.H.S., we obtain the desired inequality.

Thus, in order to show (10.4), it is sufficient to show that the reverse inequality holds.

### 10.4.3.5. Saddle Points#

(Saddle point)

A pair of vectors \(\bx^* \in X\) and \(\bz^* \in Z\) is called a saddle point of \(\phi\) if

(Saddle point inequalities)

The pair \((\bx^*, \bz^*) \in \VV \oplus \WW\) is a saddle point if and only if \(\bx^* \in X\), \(\bz^* \in Z\) and

This equality further implies that

In short

Combined with the minimax inequality (10.5), it shows that if a saddle point exists, then the minimax equality (10.4) holds.

(Saddle point = minimax equality)

A pair \((\bx^*, \bz^*)\) is a saddle point of \(\phi\) if and only if the minimax equality (10.4) holds and \(\bx^*\) is an optimal solution of the minimax problem

while \(\bz^*\) is the optimal solution of the maximin problem

Proof. Suppose that \(\bx^*\) is the optimal solution of the minimax problem and \(\bz^*\) is the optimal solution of the maximin problem.

From the minimax problem we obtain

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) = \sup_{\bz \in Z} \phi(\bx^*, \bz) \geq \phi(\bx^*, \bz^*). \]From the maximin problem we obtain

\[ \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) = \inf_{\bx \in X} \phi(\bx, \bz^*) \leq \phi(\bx^*, \bz^*). \]Combining these inequalities, we have

\[\begin{split} & \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) = \inf_{\bx \in X} \phi(\bx, \bz^*) \\ &\leq \phi(\bx^*, \bz^*) \\ &\leq \sup_{\bz \in Z} \phi(\bx^*, \bz) = \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz). \end{split}\]If the minimax equality (10.4) holds, then equality holds throughout above giving us

\[ \inf_{\bx \in X} \phi(\bx, \bz^*) = \phi(\bx^*, \bz^*) = \sup_{\bz \in Z} \phi(\bx^*, \bz). \]Hence \((\bx^*, \bz^*)\) is a saddle point of \(\phi\) as per Remark 10.7.

Conversely, assume that \((\bx^*, \bz^*)\) is a saddle point of \(\phi\).

From Remark 10.7, we have

\[\begin{split} &\inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) \\ & \leq \sup_{\bz \in Z} \phi(\bx^*, \bz) = \phi(\bx^*, \bz^*) = \inf_{\bx \in X} \phi(\bx, \bz^*) \\ & \leq \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz). \end{split}\]The minimax inequality (10.5) gives us

\[ \sup_{\bz \in Z} \inf_{\bx \in X } \phi(\bx, \bz) \leq \inf_{\bx \in X} \sup_{\bz \in Z } \phi(\bx, \bz). \]Thus, all the inequalities in the previous relationship must be equalities. We have

\[\begin{split} &\inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) \\ & = \sup_{\bz \in Z} \phi(\bx^*, \bz) = \phi(\bx^*, \bz^*) = \inf_{\bx \in X} \phi(\bx, \bz^*) \\ & = \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz). \end{split}\]Hence the minimax equality holds.

It also implies that \(\bx^*\) is an optimal solution of the minimax problem and \(\bz^*\) is the optimal solution of the maximin problem.

(Saddle points as a Cartesian product)

When the set of saddle points is nonempty, it can be written as a Cartesian product \(X^* \times Z^*\) where \(X^*\) is the set of optimal solutions of the minimax problem and the \(Z^*\) is the set of optimal solutions of the maximin problem.

In other words, \(\bx^*\) and \(\bz^*\) can be chosen independently from the sets \(X^*\) and \(Z^*\) respectively to form a saddle point. This is a direct consequence of Theorem 10.35.

If the minimax equality (10.4) does not hold, then there is no saddle point even if the minimax and maximin problems have optimal solutions.

## 10.4.4. Min Common/Max Crossing Framework for Minimax#

Recall that the minimax problem is given by

We add a linear perturbation to \(\phi\) and introduce a function \(\psi : \WW \to \ERL\) as

We can see that

The linear perturbation term impacts the optimum value of the minimax problem. We shall show that if \(\psi\) changes in a “regular” manner, then the minimax equality is guaranteed.

### 10.4.4.1. Framework Definition#

(Min common / max crossing framework for minimax problem)

We define the set \(M\) required for the min common/max crossing framework as

where \(\psi : \WW \to \ERL\) is given by

Recall that min common value is given by

\[ p^* = \inf_{(\bzero, p) \in M} p. \]Note that for \(\bu = \bzero\), \(M\) contains all the points \((\bzero, t)\) such that \(\psi(\bzero) \leq t\).

In particular, \((\bzero, \psi(\bzero)) \in M\) and if \((\bzero, t) \in M\) then \(\psi(\bzero) \leq t\).

Hence \(p^*\) is given by

\[ p^* = \psi(\bzero) = \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz). \]Since \(M\) is an epigraph, hence the sets \(M\) and \(\overline{M}\) are identical in the min common/max crossing framework.

If \(\psi\) is convex, then \(M = \epi \psi\) is convex as desired in several results of the min common/max crossing framework.

The corresponding max crossing problem is given by:

We note that

(10.7)#\[\begin{split}q(\ba) &= \inf_{(\bu, t) \in \epi \psi} \{ \langle \bu, \ba \rangle + t \}\\ &= \inf_{(\bu, t) \in \psi(\bu) \leq t} \{ \langle \bu, \ba \rangle + t \} \\ &= \inf_{\bu \in \WW} \{\psi(\bu) + \langle \bu, \ba \rangle \}.\end{split}\]Its optimal value is denoted by \(q^*\); i.e.,

\[ q^* = \sup_{\ba \in \WW} q(\ba). \]

### 10.4.4.2. Connection#

(Connection between minimax equality and min common/max crossing framework)

By putting the definition of \(\psi(\bu)\) in the expression for \(q(\ba)\), we obtain

\[\begin{split} q(\ba) &= \inf_{\bu \in \WW} \{\psi(\bu) + \langle \bu, \ba \rangle \} \\ &= \inf_{\bu \in \WW} \{ \inf_{\bx \in X} \sup_{\bz \in Z} \{ \phi(\bx, \bz) - \langle \bu, \bz \rangle \} + \langle \bu, \ba \rangle \} \\ &= \inf_{\bu \in \WW} \inf_{\bx \in X} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \}. \end{split}\]In particular, for every \(\ba \in Z\), if we set \(\bz = \ba\) in this relation, we can see that

\[ \inf_{\bx \in X} \phi(\bx, \ba) \leq q(\ba) \Forall \ba \in Z. \]From the weak duality principle, we have \(q^* \leq p^*\).

We have established that \(p^* = \psi(\bzero)\).

We can now see that

\[\begin{split} \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) &= \sup_{\ba \in Z} \inf_{\bx \in X} \phi(\bx, \ba)\\ & \leq \sup_{\ba \in Z} q(\ba) \\ & \leq \sup_{\ba \in \WW} q(\ba) \\ &= q^* \\ &\leq p^* \\ &= \psi(\bzero) \\ &= \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz). \end{split}\]This is nothing but the minimax inequality (10.5).

We can see that if the minimax equality (10.4) holds, then all inequalities in the previous relation turn into equalities and we have \(q^* = p^*\).

In other words, if the minimax equality holds, then the optimal values of the min common and max crossing problems are equal.

### 10.4.4.3. Convexity of \(\psi\)#

\(\phi\) w.r.t. \(\bx\) and convexity of \(\psi\))

(Convexity ofLet \(X\) be a nonempty convex subset of \(\VV\) and \(Z\) be a nonempty subset of \(\WW\). Let \(\phi: \VV \oplus \WW \to \RR\) be a function with \(\dom \phi = X \times Z\). Assume that for each \(\bz \in Z\), the function \(\phi(\cdot, \bz) : \VV \to \RR\) is convex. Then the function \(\psi\) as defined in (10.6) is convex.

Proof. Recall that

Fix some \(\bz \in Z\).

Consider the function \(f_z(\bx, \bu) = \phi(\bx, \bz) - \langle \bu, \bz \rangle\).

Clearly, \(f_z\) is convex for each \(\bz \in Z\) by hypothesis.

Taking the pointwise supremum over \(\bz \in Z\), the function

\[\begin{split} F (\bx, \bu) = \begin{cases} \sup_{\bz \in Z} \{ \phi(\bx, \bz) - \langle \bu, \bz \rangle \} & \bx \in X;\\ \infty & \bx \notin X. \end{cases} \end{split}\]is also convex (over \(\bx\) and \(\bu\)).

We now have

\[ \psi(\bu) = \inf_{\bx \in \VV} F (\bx, \bu). \]Since partial minimization preserves convexity, hence \(\psi\) is convex.

### 10.4.4.4. Minimax Equality Strong Duality Equivalence Conditions#

\(-\phi\) w.r.t. \(\bz\) and minimax equality)

(Closedness and convexity ofLet \(X\) be a nonempty convex subset of \(\VV\) and \(Z\) be a nonempty subset of \(\WW\). Let \(\phi: \VV \oplus \WW \to \RR\) be a function with \(\dom \phi = X \times Z\). Assume that for each \(\bx \in X\), the function \(-\phi(\bx, \cdot) \to \RR\) is closed and convex. Then the function \(q : \WW \to \ERL\) given by

where \(\psi\) as defined in (10.6), satisfies

Furthermore, we have \(q^* = p^*\) if and only if the minimax equality (10.4) holds.

Proof. We shall show the following one by one.

\(q(\ba) \geq \inf_{\bx \in X} \phi(\bx, \ba) \Forall \ba \in Z\).

\(q(\ba) \leq \inf_{\bx \in X} \phi(\bx, \ba) \Forall \ba \in Z\).

\(q(\ba) = -\infty \Forall \ba \notin Z\).

(1)

We have already established in (10.7) that

\[ q(\ba) = \inf_{\bu \in \WW} \{\psi(\bu) + \langle \bu, \ba \rangle \}. \]By using the definition of \(\psi\), we further established that

\[ q(\ba) = \inf_{\bu \in \WW} \inf_{\bx \in X} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \}. \]By rearranging the order of infimum operations, we have

\[ q(\ba) = \inf_{\bx \in X} \inf_{\bu \in \WW} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \}. \]For any \(\ba \in Z\) we have

\[\begin{split} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} &\geq \phi(\bx, \ba) + \langle \ba - \ba, \bu \rangle \\ &= \phi(\bx, \ba) \quad \Forall \bx \in X, \Forall \bu \in \WW. \end{split}\]This in turn implies that

\[ q(\ba) \geq \inf_{\bx \in X} \phi(\bx, \ba) \Forall \ba \in Z. \]

(2)

Let \(r_x : \WW \to \RR\) be given by

\[ r_x(\bz) = -\phi(\bx, \bz). \]By hypothesis \(r_x\) is closed and convex.

Let \(\ba \in Z\).

Fix some \(\bx \in X\).

Since \(r_x\) is a closed and convex function, hence \(\epi r_x\) is a closed and convex set.

Since \(\ba \in Z\), hence the point \((\ba, r_x(\ba)) \in \epi r_x\).

For some \(\epsilon > 0\), consider the point \((\ba, r_x(\ba) - \epsilon)\).

Clearly \((\ba, r_x(\ba) - \epsilon) \notin \epi r_x\).

By definition of \(r_x\), \(r_x(\bz)\) is finite for all \(\bz \in Z\), \(Z\) is nonempty and \(\epi r_x\) is closed.

Since \(r_x(\bz)\) is finite for all \(\bz \in Z\), the epigraph doesn’t contain any vertical lines.

Hence, due to Theorem 10.30, there exists a nonvertical hyperplane that strongly separates the point \((\ba, r_x(\ba) - \epsilon)\) from \(\epi r_x\).

Hence there exists a normal vector \((\bu, 1)\) and a scalar \(c\) such that

\[ \langle \ba, \bu \rangle + (r_x(\ba) - \epsilon) < c < \langle \bu, \bz \rangle + r_x(\bz) \Forall \bz \in Z. \]Substituting \(r_x(\bz) = -\phi(\bx, \bz)\), we get

\[ \langle \ba, \bu \rangle + (-\phi(\bx, \ba) - \epsilon) < \langle \bu, \bz \rangle - \phi(\bx, \bz). \]Rearranging, we get

\[ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle < \phi(\bx, \ba) + \epsilon \Forall \bz \in Z. \]Letting \(\epsilon \downarrow 0\), we have

\[ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \leq \phi(\bx, \ba) \Forall \bz \in Z. \]We further note that

\[ \inf_{\bu \in \WW} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} \leq \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} \leq \phi(\bx, \ba). \]Taking infimum over \(\bx \in X\) on the L.H.S., we obtain

\[ q(\ba) = \inf_{\bx \in X} \inf_{\bu \in \WW} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} \leq \inf_{\bx \in X} \phi(\bx, \ba) \]as desired.

(3)

Take any \(\ba \notin Z\).

Fix an arbitrary \(\bx \in X\).

Consider a sequence \(\{ t_k \}\) such that \(t_k \to \infty\).

Since \(\ba \notin Z\), hence \((\ba, t_k) \notin \epi r_x\) for every \(k\).

Again applying Theorem 10.30, for each \(k\), there exists a nonvertical hyperplane strongly separating \((\ba, t_k)\) from \(\epi r_x\).

Hence for each \(k\), there exists a normal vector \((\bu_k , 1)\) such that

\[ \langle \ba, \bu_k \rangle + t_k <\langle \bz, \bu_k \rangle - \phi(\bx, \bz) \Forall \bz \in Z. \]Rearranging, we have

\[ \phi(\bx, \bz) + \langle \ba - \bz, \bu_k \rangle < -t_k, \quad \Forall \bz \in Z, \Forall k. \]Thus, we have

\[ \inf_{\bu \in \WW} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} \leq \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu_k \rangle \} \leq -t_k \quad \Forall k. \]Taking the limit on the R.H.S. as \(k \to \infty\), we can see that

\[ \inf_{\bu \in \WW} \sup_{\bz \in Z} \{ \phi(\bx, \bz) + \langle \ba - \bz, \bu \rangle \} = -\infty \quad \Forall \bx \in X. \]Finally, taking the infimum over \(\bx \in X\), we can see that

\[ q(\ba) = -\infty. \]

By Observation 10.7, if minimax equality holds, then \(q^* = p^*\). For the converse, we now assume that \(q^* = p^*\). Then

Thus the minimax equality holds.

## 10.4.5. Minimax Theorems#

### 10.4.5.1. First Theorem#

(Minimax theorem I)

Let \(X\) be a nonempty convex subset of \(\VV\) and \(Z\) be a nonempty subset of \(\WW\). Let \(\phi: \VV \oplus \WW \to \RR\) be a function with \(\dom \phi = X \times Z\). Assume that for each \(\bz \in Z\), the function \(\phi (\cdot, \bz) : \VV \to \RR\) is convex, and for each \(\bx \in X\), the function \(-\phi(\bx, \cdot) : \WW \to \RR\) is closed and convex. Assume further that

Then, the minimax equality (10.4) holds; i.e.,

if and only if the function \(\psi\) as defined in (10.6) is lower semicontinuous at \(\bu = \bzero\); i.e.,

for every sequence \(\{ \bu_k \}\) with \(\bu_k \to \bzero\).

Proof. The proof consists of establishing the correspondence between the conditions in this result and the conditions in the first min common/max crossing theorem (Theorem 10.33).

We choose the set \(M\) as described in Definition 10.19, to be the epigraph of the function \(\psi\).

\[ M = \overline{M} = \epi \psi = \{ (\bu, t) \in \WW \oplus \RR \ST \psi(\bu) \leq t \}. \]We have shown in Definition 10.19 that \(p^* = \psi(\bzero) = \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz)\).

By hypothesis, \(p^* < \infty\).

Hence the first assumption of Theorem 10.33 is satisfied.

Following Lemma 10.2, \(\psi\) is convex.

Hence \(M = \epi \psi\) is also convex.

Hence the second assumption of Theorem 10.33 is satisfied.

Finally, the condition

\[ \psi(\bzero) \leq \liminf_{k \to \infty} \psi(\bu_k) \]is equivalent to the condition of Theorem 10.33 that for every sequence \(\{ (\bu_k, t_k) \}\) of \(M\) with \(\bu_k \to \bzero\).

\[ p^* \leq \liminf_{k \to \infty} t_k \]holds true.

We have \(t_k \geq \psi(\bu_k)\).

Hence

\[ \liminf_{k \to \infty} t_k \geq \liminf_{k \to \infty} \psi(\bu_k) \geq \psi(\bzero) = p^*. \]

Following Theorem 10.33, this condition holds if and only if \(p^* = q^*\).

Since \(-\phi(\bx, \cdot)\) is closed and convex, hence following Lemma 10.3, this condition holds if and only if minimax equality holds.

### 10.4.5.2. Second Theorem#

We can adapt the argument of first minimax theorem to include conditions on the lines of the second min common/max crossing theorem Theorem 10.34.

(Minimax theorem II)

Let \(X\) be a nonempty convex subset of \(\VV\) and \(Z\) be a nonempty subset of \(\WW\). Let \(\phi: \VV \oplus \WW \to \RR\) be a function with \(\dom \phi = X \times Z\). Assume that for each \(\bz \in Z\), the function \(\phi (\cdot, \bz) : \VV \to \RR\) is convex, and for each \(\bx \in X\), the function \(-\phi(\bx, \cdot) : \WW \to \RR\) is closed and convex. Assume further that

and that \(\bzero\) lies in the relative interior of the effective domain of the function \(\psi\) as defined in (10.6). Then, the minimax equality (10.4) holds; i.e.,

and the supremum over \(Z\) in the left hand side is finite and is attained. Furthermore, the set of \(\bz \in Z\) attaining this supremum is compact if and only if \(\bzero\) lies in the interior of the effective domain of \(\psi\).

## 10.4.6. Saddle Point Theorems#

From the first minimax theorem (Theorem 10.36), we can see that minimax equality is satisfied if and only if the function \(\psi\) is lower semicontinuous at \(\bu = \bzero\).

If \(\psi\) is closed, then it will be lower semicontinuous.

The proof of Lemma 10.2 shows that \(\psi\) can be written as a partial minimization of \(F\)

\[ \psi(\bu) = \inf_{\bx \in \VV} F (\bx, \bu). \]where \(F: \VV \oplus \WW \to \RERL\) is given by

\[\begin{split} F (\bx, \bu) = \begin{cases} \sup_{\bz \in Z} \{ \phi(\bx, \bz) - \langle \bu, \bz \rangle \} & \bx \in X;\\ \infty & \bx \notin X. \end{cases} \end{split}\]The results in Partial Minimization and Closedness can be used to guarantee the closedness of \(\psi\).

These results can also guarantee that the infimum of \(F\) over \(\bx\) is attained.

In particular \(\psi(\bzero)\) will be finite and hence guaranteeing that the optimal value of the minimax problem is finite.

(Auxiliary functions for the minimax problem)

Let \(X\) be a nonempty convex subset of \(\VV\) and \(Z\) be a nonempty subset of \(\WW\). Let \(\phi: \VV \oplus \WW \to \RR\) be a function with \(\dom \phi = X \times Z\).

For each \(\bz \in Z\), the function \(\eta_z : \VV \to \RERL\) is defined as

For each \(\bx \in X\), the function \(\rho_x : \WW \to \RERL\) is defined as

The function \(\eta: \VV \to \RERL\) is defined as

The function \(\rho: \WW \to \RERL\) is defined as

Following remarks are in order.

If \(\eta_z\) is closed and convex for each \(\bz \in Z\), then \(\eta\) is closed and convex due to Theorem 9.114.

If \(\rho_x\) is closed and convex for each \(\bx \in X\), then \(\rho\) is closed and convex due to Theorem 9.114.

By definition, \(\eta(\bx) > -\infty\) for every \(\bx \in X\).

\(\eta\) can be proper only if \(\eta(\bx) < \infty\) for some \(\bx \in X\).

Equivalently, for the optimal minimax value

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty \]must hold for \(\eta\) to be proper.

Similarly, \(\rho (\bz) > -\infty\) for every \(\bz \in Z\).

\(\rho\) can be proper only if \(\rho(\bz) < \infty\) for some \(\bz \in Z\).

This is possible only if

\[ \inf_{\bz \in Z} \sup_{\bx \in X} (- \phi (\bx, \bz)) < \infty. \]Equivalently, for the optimal maximin value

\[ -\infty < \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) \]must hold true.

The set of minimizers of \(\eta\) are the optimal points for the minimax problem: \(X^*\).

The set of minimizers of \(\rho\) are the optimal points of the maximin problem: \(Z^*\).

### 10.4.6.1. Minimax Equality and Attainment of Minimax Solution#

In the following results, we provide conditions under which the minimax equality is attained and the the set of optimal solutions for the minimax problem is nonempty.

\(\eta\))

(Compact sublevel sets ofAssume the following:

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

The optimal minimax value is less than infinity

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty. \]The sublevel sets of \(\eta\) are compact.

Then the minimax equality (10.4) holds and the set of optimal points for the minimax problem \(X^*\) is nonempty and compact.

Proof. .

Under the assumptions above, the function \(\eta\) is proper, closed and convex.

By definition \(F(\bx, \bu) > -\infty\) for every \(\bx, \bu\).

Note that \(F(\bx, \bzero) = \eta(\bx)\).

Since \(\eta\) is proper, hence there exists \(\bx \in X\) such that \(F(\bx, \bzero) < \infty\).

Hence \(F\) is also proper.

\(F\) is also closed and convex.

Fix some \(\bz \in Z\).

Consider the function \(f_z(\bx, \bu) = \eta_z(\bx) - \langle \bu, \bz \rangle\).

\(\eta_z\) is closed and convex for every \(\bz\) by hypothesis.

Hence \(f_z\) is closed and convex for every \(\bz\).

Taking the pointwise supremum over \(\bz \in Z\), \(F\) is closed and convex.

The sets

\[ \{ \bx \ST F(\bx, \bzero) \leq t \} = \{ \bx \ST \eta(\bx) \leq t \} \]are the sublevel sets of \(\eta\) which are compact for every \(t\) by hypothesis.

We can easily select a scalar for which the sublevel set of \(\eta\) is also nonempty since \(\eta\) is proper.

Hence, due to Theorem 9.120, the function \(\psi\) which is a partial minimization of \(F\) over \(\bx\) is proper, closed and convex.

Since \(\psi\) is closed, hence it is lower semicontinuous.

In particular, \(\psi\) is l.s.c. at \(\bu = \bzero\).

Hence due to Theorem 10.36, the minimax equality holds.

Since \(X^*\) is the set of minimizers of \(\eta\), \(\eta\) is proper and closed, and the sublevel sets of \(\eta\) are compact, hence \(X^*\) is nonempty and compact due to Weierstrass’ theorem (Theorem 8.9).

\(\eta\))

(Recession and constancy space ofAssume the following:

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

The optimal minimax value is less than infinity

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty. \]The recession cone and constancy space of the function \(\eta\) are equal.

Then the minimax equality (10.4) holds and the set of optimal points for the minimax problem \(X^*\) is nonempty.

\(F\) domain as a set of linear inequalities)

(Assume the following:

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

The optimal minimax value is less than infinity

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty. \]The function

\[\begin{split} F (\bx, \bu) = \begin{cases} \sup_{\bz \in Z} \{ \phi(\bx, \bz) - \langle \bu, \bz \rangle \} & \bx \in X;\\ \infty & \bx \notin X \end{cases} \end{split}\]has the form

\[\begin{split} F (\bx, \bu) = \begin{cases} \overline{F}(\bx, \bu) & (\bx, \bu) \in C;\\ \infty & (\bx, \bu) \notin C \end{cases} \end{split}\]where \(\overline{F}\) is a proper, closed, and convex function on \(\VV \oplus \WW\) and \(C\) is specified by linear inequalities; i.e,

\[ C = \{ (\bx, \bu) \ST \bA \bx + \bB \bu \preceq \bb \}, \]where \(\bA, \bB\) are matrices and \(\bb\) is a vector.

Every common direction of recession of \(C\) and \(\overline{F}\) is a direction along which \(\overline{F}\) is constant.

Then the minimax equality (10.4) holds and the set of optimal points for the minimax problem \(X^*\) is nonempty.

\(\phi\))

(Quadratic formAssume the following:

\(\VV = \RR^n\) and \(\WW = \RR^m\).

The function \(\phi\) has a quadratic form

\[ \phi(\bx, \bz) = \bx^T \bQ \bx + \bc^T \bx + \bz^T \bM \bx - \bz^T \bR \bz - \bd^T \bz \]where \(\bQ\) and \(\bR\) are symmetric matrices, \(\bM\) is a matrix, and \(c\) and \(d\) are vectors.

\(Z = \RR^m\).

\(X\) is a set of the form

\[ X = \{ \bx \ST \bx^T \bQ_j \bx + \ba^T_j \bx + b_j \leq 0, j=1, \dots, r \} \]where \(\bQ_j\) are symmetric positive semidefinite matrices, \(\ba_j\) are vectors and \(b_j\) are scalars.

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

The optimal minimax value satisfies

\[ -\infty < \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty. \]

Then the minimax equality (10.4) holds and the set of optimal points for the minimax problem \(X^*\) is nonempty.

### 10.4.6.2. Existence of Saddle Points#

\(\eta\) and \(\rho\))

(Compact sublevel sets ofAssume the following:

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

Assume that either

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty \]or

\[ -\infty < \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) \]holds true.

The sublevel sets of \(\eta\) and \(\rho\) are compact.

Then the set of saddle points of \(\phi\) is nonempty and compact.

Proof. First assume that \(\inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty\).

By applying Theorem 10.38, minimax equality holds and \(X^*\) is nonempty.

Hence \(\inf_{\bx \in X} \sup_{\bz \in Z } \phi(\bx, \bz)\) is finite.

Due to minimax equality:

\[ -\infty < \sup_{\bz \in Z} \inf_{\bx \in X } \phi(\bx, \bz) = \inf_{\bx \in X} \sup_{\bz \in Z } \phi(\bx, \bz) < \infty. \]We reverse the roles of \(\bx\) and \(\bz\) and the sign of \(\phi\) and apply Theorem 10.38 again to show that \(Z^*\) is nonempty and compact set.

The set of saddle points is a Cartesian product of \(X^*\) and \(Z^*\).

Since both \(X^*\) and \(Z^*\) are nonempty and compact, hence \(X^* \times Z^*\) is also nonempty and compact.

Now assume that \(-\infty < \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz)\).

Then \(\inf_{\bz \in Z} \sup_{\bx \in X} (-\phi(\bx, \bz)) < \infty\).

we can then reverse the role of \(\bx\) and \(\bz\) in the preceding argument.

\(\eta\) and \(\rho\))

(Recession cones and constancy spaces ofAssume the following:

\(\eta_z\) is closed and convex for every \(\bz \in Z\).

\(\rho_x\) is closed and convex for every \(\bx \in X\).

Assume that either

\[ \inf_{\bx \in X} \sup_{\bz \in Z} \phi(\bx, \bz) < \infty \]or

\[ -\infty < \sup_{\bz \in Z} \inf_{\bx \in X} \phi(\bx, \bz) \]holds true.

The recession cones and constancy spaces of \(\eta\) and \(\rho\) are equal to each other:

\[ R_{\eta} = L_{\eta} \text{ and } R_{\rho} = L_{\rho}. \]

Then the set of saddle points of \(\phi\) is nonempty.

### 10.4.6.3. Saddle Point Theorem#

The analysis in this section culminates into the following result.

(Saddle point theorem)

Assume that \(\eta_z\) is closed and convex for every \(\bz \in Z\) and \(\rho_x\) is closed and convex for every \(\bx \in X\). Then the set of saddle points of \(\phi\) is nonempty and compact under any of the following conditions.

\(X\) and \(Z\) are compact.

\(Z\) is compact and there exists a vector \(\overline{\bz} \in Z\) and a scalar \(c\) such that the sublevel set

\[ \{ \bx \in X \ST \phi(\bx, \overline{\bz}) \leq c \} \]is nonempty and compact.

\(X\) is compact and there exists a vector \(\overline{\bx} \in X\) and a scalar \(c\) such that the superlevel set

\[ \{ \bz \in Z \ST \phi(\overline{\bx}, \bz) \geq c \} \]is nonempty and compact.

There exist vectors \(\overline{\bx} \in X\) and \(\overline{\bz} \in Z\) and a scalar \(c\) such that the sets

\[ \{ \bx \in X \ST \phi(\bx, \overline{\bz}) \leq c \} \text{ and } \{ \bz \in Z \ST \phi(\overline{\bx}, \bz) \geq c \} \]are nonempty and compact.