9.2. Convex Sets#

Throughout this section, we assume that \(\VV\) is a real vector space. Wherever necessary, it is equipped with a norm \(\| \cdot \| : \VV \to \RR\) or a real inner product \(\langle \cdot, \cdot \rangle : \VV \times \VV \to \RR\). It is also equipped with a metric \(d(\bx, \by) = \| \bx - \by \|\).

9.2.1. Line Segments#

Definition 9.5 (Line segment)

Let \(\bx_1\) and \(\bx_2\) be two points in \(\VV\). Points of the form

\[ \by = (1 - \theta) \bx_1 + \theta \bx_2 \text{ where } 0 \leq \theta \leq 1 \]

form a (closed) line-segment between \(\bx_1\) and \(\bx_2\). The closed line segment is denoted by \([\bx_1, \bx_2]\).

\[ [\bx_1, \bx_2] \triangleq \{ (1 - \theta) \bx_1 + \theta \bx_2 \ST 0 \leq \theta \leq 1 \}. \]

Similarly, we define an open line segment as:

\[ (\bx_1, \bx_2) \triangleq \{ (1 - \theta) \bx_1 + \theta \bx_2 \ST 0 < \theta < 1 \}. \]

The half-open segment \((\bx_1, \bx_2]\) is defined as:

\[ (\bx_1, \bx_2] \triangleq \{ (1 - \theta) \bx_1 + \theta \bx_2 \ST 0 < \theta \leq 1 \}. \]

The half-open segment \([\bx_1, \bx_2)\) is defined as:

\[ [\bx_1, \bx_2) \triangleq \{ (1 - \theta) \bx_1 + \theta \bx_2 \ST 0 \leq \theta < 1 \}. \]

9.2.2. Convex Sets#

Definition 9.6 (Convex set)

A set \(C \subseteq \VV\) is convex if the line segment between any two points in \(C\) lies in \(C\). i.e.

\[ \theta \bx_1 + (1 - \theta) \bx_2 \in C \Forall \bx_1, \bx_2 \in C \text{ and } 0 \leq \theta \leq 1. \]

The empty set is vacuously convex. The entire vector space \(\VV\) is convex since it contains all the line segments between any pair of points in the space.

../_images/pic_convex_sets.png

Fig. 9.1 Examples of convex sets#

Observation 9.2

Since a convex set contains the line segment between any two points, any line segment is convex by definition.

Geometrically, a convex set has no holes, hollows, pits or dimples. A set is convex if from each point in the set, it is possible to see every other point without having the line of sight pass outside the set.

Example 9.1 (Real line)

On the real line \(\RR\), the empty set, sing points, intervals (closed, open, half open), half lines, and the entire real line are convex sets. There are no other convex sets possible.

Theorem 9.7

Any linear subspace is convex.

Proof. Let \(\EE\) be a linear subspace of \(\VV\). Then \(\EE\) is closed under addition and scalar multiplication. Thus, for any \(\bx, \by \in \EE\) and \(0 \leq t \leq 1\),

\[ t \bx + (1-t)\by \in \EE. \]

Thus, \(\EE\) is convex.

Theorem 9.8

Any affine set is convex.

Proof. Let \(C \subseteq \VV\) be an affine set. By definition, for any \(\bx, \by \in C\) and any \(t \in \RR\), \(t \bx + (1-t)\by \in C\). It is valid in particular for \(0 \leq t \leq 1\). Thus, \(C\) is convex.

Theorem 9.9

Any hyperplane is convex since it is affine.

Theorem 9.10

Half spaces are convex.

Proof. Consider \(H_+\) defined as:

\[ H_+ = \{ x : \langle \ba, \bx \rangle \geq b \} \]

Let \(\bx, \by \in H_+\). Then:

\[ \langle \ba, \bx \rangle \geq b \text{ and } \langle \ba, \by \rangle \geq b. \]

For some \(0 \leq t \leq 1\):

\[ \langle \ba, t \bx + (1 - t) \by \rangle = t \langle \ba, \bx \rangle + (1 - t)\langle \ba, \by \rangle \geq t b + (1 -t )b = b. \]

Thus, \(\bx + (1 - t) \by \in H_+\). Analogous proofs apply for other types of half spaces.

Theorem 9.11 (Convex set as convex combination of itself)

Let \(C\) be a nonempty subset of \(\VV\). If \(C\) is convex then for every \(t_1, t_2 \geq 0\), we have

\[ (t_1 + t_2) C = t_1 C + t_2 C. \]

In particular, if \(t_1, t_2 \geq 0\) such that \(t_1 + t_2 = 1\), then

\[ C = t_1 C + t_2 C. \]

Proof. The statement \((t_1 + t_2) C \subseteq t_1 C + t_2 C\) is valid even for sets which are not convex.

  1. Let \(\bx \in t_1 + t_2) C\).

  2. Then there exist \(\by \in C\) such that \(\bx = t_1 \by + t_2 \by\).

  3. Hence \(t_1 \by \in t_1 C\) and \(t_2 \by \in t_2 C\).

  4. Hence \(\bx = t_1 \by + t_2 \by \in t_1 C + t_2 C\).

We now show the converse.

  1. Let \(\bx \in t_1 C + t_2 C\).

  2. Then there exist \(\bx_1, \bx_2 \in C\) such that

    \[ \bx = t_1 \bx_1 + t_2 \bx_2. \]
  3. If \(t_1 = t_2 = 0\), then \(\bx = 0\) and \(\bx \in (0 + 0) C\).

  4. Now assume that \(t_1 + t_2 > 0\).

  5. By the convexity of \(C\),

    \[ \by = \frac{t_1}{t_1 + t_2} \bx_1 + \frac{t_2}{t_1 + t_2} \bx_2 \in C. \]
  6. Hence \(\bx = (t_1 + t_2) \by \in (t_1 + t_2) C\).

  7. Hence \(t_1 C + t_2 C \subseteq (t_1 + t_2) C\).

Together, we have

\[ (t_1 + t_2) C = t_1 C + t_2 C. \]

Theorem 9.12 (Convex set as union of line segments)

Let \(C\) be a convex subset of \(\VV\). Then \(C\) is the union of all the closed line segments connecting the points of the set. In other words

\[ C = \bigcup_{\bx, \by \in C} [\bx, \by]. \]

Proof. Let \(D = \bigcup_{\bx, \by \in C} [\bx, \by]\).

If \(C\) is empty, then \(D\) is also empty. Hence there is nothing to prove. If \(C = \{ \bx \}\) is a singleton, then \(D\) consists of the union of a single line segment

\[ [\bx, \bx] = \{ \bx \}. \]

So \(C = D\). We now consider the case where \(C\) consists of more than one point.

We first show that \(C \subseteq D\).

  1. Let \(\bx \in C\).

  2. Then \([\bx, \bx] = \{ \bx \}\) is a line segment of \(C\).

  3. Hence \(\bx \in [\bx, \bx] \subseteq D\).

  4. Hence \(C \subseteq D\).

We now show the converse.

  1. Let \(\bz \in D\).

  2. Then there exists \(\bx, \by \in C\) such that \(\bz \in [\bx, \by]\).

  3. Then by convexity of \(C\), \([\bx, \by] \subseteq C\).

  4. Hence \(\bz \in [\bx, \by] \subseteq C\).

  5. Hence \(D \subseteq C\).

Together, \(C = D\).

9.2.3. Rays#

Definition 9.7 (Ray)

A ray \(R\) is defined as

\[ R \triangleq \{ \bx_0 + t \bv \ST t \geq 0 \} \]

where \(\bv \neq \bzero\) indicates the direction of ray and \(\bx_0\) is the base or origin of ray.

Theorem 9.13

A ray is convex.

Proof. Let a ray be given as:

\[ R = \{ \bx_0 + t \bv \ST t \geq 0 \}. \]

Let \(\bu, \bv \in R\). Thus, there is \(t_u, t_v \geq 0\) such that:

\[ \bu = \bx_0 + t_u \bv \text{ and } \bv = \bx_0 + t_v \bv. \]

Now, for some \(0 \leq r \leq 1\),

\[\begin{split} r \bu + (1 - r) \bv &= r (\bx_0 + t_u \bv) + (1 - r) (\bx_0 + t_v \bv)\\ &= \bx_0 + (r t_u + (1 - r) t_v) \bv. \end{split}\]

Since \(r t_u + (1 - r) t_v \geq 0\), hence \(r \bu + (1 - r) \bv \in R\).

9.2.4. Balls#

Theorem 9.14

An open ball \(B(\ba, r)\) is convex for any norm \(\| \cdot \| : \VV \to \RR\).

Proof. Recall that an open ball in a normed linear space is defined as:

\[ B(\ba,r) = \{ \bx \in \VV \ST \| \bx - \ba \| < r \}. \]

Let \(\bx, \by \in B(\ba,r)\) and let \(0 \leq t \leq 1\). Then,

\[ t \bx + (1-t)\by - \ba = t (\bx - \ba) + (1-t) (\by - \ba). \]

By triangle inequality:

\[\begin{split} \| t (\bx - \ba) + (1-t) (\by - \ba) \| &\leq \| t (\bx - \ba) \| + \| (1-t) (\by - \ba) \|\\ &= t \| \bx - \ba \| + (1 -t) \| \by - \ba \|\\ &< t r + (1 - t)r = r. \end{split}\]

Thus,

\[ \| t \bx + (1-t)\by - \ba \| < r \implies t \bx + (1-t)\by \in B(\ba,r). \]

Theorem 9.15

A closed ball \(B[\ba, r]\) is convex for any norm \(\| \cdot \| : \VV \to \RR\).

Proof. Recall that a closed ball in a normed linear space is defined as:

\[ B[\ba,r] = \{ \bx \in \VV \ST \| \bx - \ba \| \leq r \}. \]

Let \(\bx, \by \in B(\ba,r)\) and let \(0 \leq t \leq 1\). Then,

\[ t \bx + (1-t)\by - \ba = t (\bx - \ba) + (1-t) (\by - \ba). \]

By triangle inequality:

\[\begin{split} \| t (\bx - \ba) + (1-t) (\by - \ba) \| &\leq \| t (\bx - \ba) \| + \| (1-t) (\by - \ba) \|\\ &= t \| \bx - \ba \| + (1 -t) \| \by - \ba \|\\ &\leq t r + (1 - t)r = r. \end{split}\]

Thus,

\[ \| t \bx + (1-t)\by - \ba \| \leq r \implies t \bx + (1-t)\by \in B[\ba,r]. \]

9.2.5. Convex Combinations#

Definition 9.8 (Convex combination)

We call a point of the form \(\theta_1 \bx_1 + \dots + \theta_k \bx_k\), where \(\theta_1 + \dots + \theta_k = 1\) and \(\theta_i \geq 0, i=1,\dots,k\), a convex combination of the points \(\bx_1, \dots, \bx_k\).

It is like a weighted average of the points \(\bx_i\). A weights \(\theta_i\) in a convex combination can be interpreted as probabilities or proportions.

Example 9.2 (Center of mass)

Consider a system of particles \(p_i\), \(i=1,\dots,n\) each with mass \(m_i\) and location in space as \(\bx_i\). The center of mass \(\bx\) satisfies the equation:

\[ \sum_{i=1}^n m_i (\bx_i - \bx) = 0. \]

Solving this equation gives us:

\[ \bx = \sum_{i=1}^n \frac{m_i}{m} \bx_i \]

where \(m = \sum_{i=1}^n m_i\).

If we assign \(\theta_i = \frac{m_i}{m}\), we notice that \(\theta_i \geq 0\) and \(\sum_{i=1}^n \theta_i = 1\). We can now write the center of mass as:

\[ \bx = \sum_{i=1}^n \theta_i \bx_i \]

which is a convex combination of the locations \(\bx_i\) where \(\theta_i\) gives the proportion of contribution of each particle according to its mass.

Remark 9.2 (Convex combinations and unit simplex)

We recall that the unit simplex in \(\RR^n\) is given by

\[ \Delta_n = \{ \bt \in \RR^n \ST \langle \bt, \bone \rangle = 1, \bt \succeq \bzero \} = \{\bt \in \RR^n \ST t_1 + \dots + t_n = 1, t_1, \dots, t_n \geq 0 \}. \]

Thus, the coefficients for convex combinations of \(n\) points are drawn from \(\Delta_n\).

Theorem 9.16 (Closure under convex combinations)

A set is convex if and only if it contains all convex combinations of its points.

Let \(\VV\) be a real vector space and \(C\) be a subset of \(\VV\). Then, \(C\) is convex if and only if for any \(m \in \Nat\), for any \(\bx_1, \dots, \bx_m \in C\), and for every \(\bt \in \Delta_m\), \(t_1 \bx_1 + \dots + t_m \bx_m \in C\).

Proof. We know that a set \(C\) is convex is equivalent to saying that it contains all 2 point convex combinations; i.e, for any \(\bx_1, \bx_2 \in C\), \(t_1, t_2 \geq 0\) and \(t_1 + t_2 = 1\),

\[ t_1 \bx_1 + t_2 \bx_2 \in C. \]

We first show that if \(C\) is convex, it contains all its (finite) convex combination by induction.

  1. By definition \(C\) contains all its 2 point convex combinations.

  2. As induction hypothesis, assume that \(C\) contains all convex combinations of \(m-1\) or fewer points where \(m > 2\).

  3. Consider a convex combination of \(m\) points

    \[ \sum_{i=1}^m t_i \bx_i \]

    where \(\bx_i \in C\), \(t_i \geq 0\), \(\sum t_i = 1\).

  4. Since \(\sum t_1 = 1\), hence at least one of \(t_i < 1\).

  5. Without loss of generality, assume \(t_m < 1\).

  6. Note that \(t_m < 1\) means that \(1 - t_m > 0\).

  7. Define \(\by = \sum_{i=1}^{m-1} t'_i \bx_i\) where \(t'_i = \frac{t_i}{1 - t_m}\).

  8. Note that \(t'_i \geq 0\). Also, \(\sum_{i=1}^{m-1} t'_i = 1\) since \(\sum_{i=1}^{m-1} t_i = 1 - t_m\).

  9. Thus, \(\by\) is an \(m-1\) point convex combination of \(C\).

  10. By induction hypothesis, \(\by \in C\).

  11. Now, \((1-t_m) \by = \sum_{i=1}^{m-1} t_i \bx_i\).

  12. Hence, \(\bx = (1 - t_m) \by + t_m \bx_m\).

  13. It is a 2 point convex combination of \(\by\) and \(\bx_m\).

  14. Since both \(\by, \bx_m \in C\), hence \(\bx \in C\).

  15. Thus, \(C\) contains all its \(m\) point convex combinations.

For the converse, note that if \(C\) contains all its convex combinations, then it contains, in particular, all its two point convex combinations. Hence, \(C\) is convex.

Theorem 9.17

A convex combination of convex combinations is a convex combination.

Proof. Let \(S \subseteq \VV\). Note that \(S\) is arbitrary (no convexity assumed).

  1. Consider \(n\) points \(\by_i\), \(i=1,\dots, n\) described as below.

  2. Let \(\by_i = \sum_{j=1}^{m_j}t_{i,j} \bx_{i,j}\) be convex combinations of \(m_j\) points:

    • \(\bx_{i,1}, \dots, \bx_{i,m_j} \in S\).

    • \(t_{i,j} \geq 0\).

    • \(\sum_{j=1}^{m_j} t_{i, j} = 1\).

  3. Consider the convex combination \(\by = \sum_{i=1}^n r_i \by_i\).

    • \(r_i \geq 0\).

    • \(\sum r_i = 1\).

  4. We need to show that \(\by\) is a convex combination of points of \(S\).

Towards this:

\[\begin{split} \by &= \sum_{i=1}^n r_i \by_i\\ &= \sum_{i=1}^n r_i \sum_{j=1}^{m_j}t_{i,j} \bx_{i,j}\\ &= \sum_{i=1}^n \sum_{j=1}^{m_j} r_i t_{i,j} \bx_{i,j}. \end{split}\]

Consider the terms:

\[ s_{i, j} = r_i t_{i,j}. \]

Since \(r_i \geq 0\) and \(t_{i, j} \geq 0\), hence \(s_{i, j } \geq 0\).

Now, consider their sum:

\[\begin{split} \sum_{i=1}^n \sum_{j=1}^{m_j} s_{i, j} &= \sum_{i=1}^n \sum_{j=1}^{m_j} r_i t_{i,j} \\ &= \sum_{i=1}^n r_i \sum_{j=1}^{m_j} t_{i,j}\\ &= \sum_{i=1}^n r_i = 1\\ \end{split}\]

Thus, \(\sum_{i,j} s_{i, j} = 1\).

Hence,

\[ \by = \sum_{i,j} s_{i, j} x_{i, j} \]

is a convex combination of points of \(S\).

9.2.6. Convex Hull#

Definition 9.9 (Convex hull)

The convex hull of an arbitrary set \(S \subseteq \VV\) denoted as \(\ConvexHull(S)\), is the set of all convex combinations of points in \(S\).

\[ \ConvexHull(S) = \{ \theta_1 \bx_1 + \dots + \theta_k \bx_k | \bx_k \in S, \theta_i \geq 0, i = 1,\dots, k, \theta_1 + \dots + \theta_k = 1\}. \]
../_images/pic_convex_hull_random_2d_points.png

Fig. 9.2 The convex hull of a set of random points on the 2D plane#

Property 9.1 (Convexity of convex hull)

The convex hull \(\ConvexHull(S)\) of a set \(S\) is always convex.

Proof. Let \(\bx, \by \in \ConvexHull(S)\), \(t \in [0,1]\) and the point \(\bz = t \bx + (1-t) \by\). We need to show that \(\bz \in \ConvexHull(S)\).

  1. \(\bx, \by\) are convex combinations of points of \(S\).

  2. \(\bz\) is a convex combination of \(\bx\) and \(\by\).

  3. Hence, \(\bz\) is a convex combination of convex combinations of points in \(S\).

  4. By Theorem 9.17, a convex combination of convex combinations is a convex combination.

  5. Thus, \(\bz\) is a convex combination of points of \(S\).

  6. But \(\ConvexHull(S)\) contains all convex combinations of points of \(S\) by definition.

  7. Hence, \(\bz \in \ConvexHull(S)\).

  8. Thus, \(\ConvexHull(S)\) is convex.

Property 9.2 (Affine hull of convex hull)

Let \(\VV\) be a finite dimensional real vector space. Let \(S\) be a nonempty subset of \(\VV\). Let \(C = \convex(S)\). Then

\[ \affine S = \affine C. \]

In other words, the affine hull of a set and its convex hull are identical.

Proof. By using a translation argument if necessary, without loss of generality, we assume that \(\bzero \in S\).

  1. Then both \(\affine S\) and \(\affine C\) are linear subspaces.

  2. Since \(S \subseteq C\), hence \(\affine S \subseteq \affine C\).

  3. For the converse, assume that \(m = \affine C\).

  4. Let \(\bx_1, \dots, bx_m \in C\) be \(m\) linearly independent vectors spanning \(\affine C\).

  5. Then for every \(\bx \in \affine C\), there exist scalars \(t_1, \dots, t_m\) so that

    \[ \bx = \sum_{i=1}^m t_i \bx_i. \]
  6. By definition of convex hull, each \(\bx_i \in C\) is a convex combination of points in \(S\).

  7. Hence every \(\bx \in \affine C\) is a linear combination of points in \(S\).

  8. Hence \(\affine C \subseteq \affine S\).

Theorem 9.18

The convex hull of a set \(S\) is the smallest convex set containing it. In other words, let \(C\) be any convex set such that \(S \subseteq C\). Then \(\ConvexHull(S) \subseteq C\).

Proof. Let \(C\) be a convex set such that \(S \subseteq C\).

  1. Let \(\bx \in \ConvexHull(S)\).

  2. Then, \(\bx\) is a convex combination of points of \(S\).

  3. \(C\) is convex and \(S \subseteq C\).

  4. Hence, \(C\) contains every convex combination of points of \(S\).

  5. Thus, in particular \(\bx \in C\).

  6. Since \(\bx \in \ConvexHull(S)\) was arbitrary, hence \(\ConvexHull(S) \subseteq C\).

We could have started as defining the convex hull of \(S\) being the smallest convex set containing \(S\) and arrived at the conclusion that \(\ConvexHull(S)\) contains all convex combinations of \(S\). Some authors prefer to define \(\ConvexHull(S)\) as the smallest convex set containing \(S\). Both definitions are equivalent.

9.2.6.1. Carathéodory Theorem#

Theorem 9.19 (Carathéodory theorem)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S \subseteq \VV\). Let \(\bx \in \ConvexHull(S)\).

Then, there exists a set of \(n+1\) points \(\bx_0, \dots, \bx_n \in S\) such that

\[ \bx \in \ConvexHull (\{ \bx_0, \dots, \bx_n\}); \]

i.e., there exists a \(\bt \in \Delta_{n+1}\) such that

\[ \bx = \sum_{i=0}^n t_i \bx_i. \]

Proof. We note that \(\bx \in \ConvexHull(S)\).

  1. Thus, there exists a set of \(k+1\) points in \(S\) and \(\bt \in \Delta_{k+1}\) such that

    \[ \bx = \sum_{i=0}^k t_i \bx_i. \]
  2. We can assume \(t_i > 0\) for all \(i=0, \dots, k\) since otherwise, we can drop the vectors corresponding to the zero coefficients from the convex combination.

  3. If \(k \leq n\), there is nothing to prove.

  4. Hence, consider the case where \(k > n\).

  5. We now describe a process which can reduce the number of points in the convex combination by one.

  6. The \(k\) vectors \(\bx_1 - \bx_0, \dots, \bx_k - \bx_0\) are linearly dependent as \(k > n\) and \(\VV\) is \(n\)-dimensional.

  7. Thus, there is a nontrivial linear combination of these vectors

    \[ r_1 (\bx_1 - \bx_0) + \dots + r_k (\bx_k - \bx_0) = \bzero. \]
  8. Let \(r_0 = -r_1 - \dots - r_k\). Then, we have a nontrivial combination

    \[ \sum_{i=0}^k r_i \bx_i = \bzero \]

    with \(\sum r_i = 0\).

  9. In particular, there exists at least one index \(j\) for which \(r_j < 0\).

  10. Let \(\alpha \geq 0\).

  11. Then,

    \[ \bx = \sum_{i=0}^k t_i \bx_i + \alpha \sum_{i=0}^k r_i \bx_i = \sum_{i=0}^k (t_i + \alpha r_i) \bx_i \]

    with \(\sum (t_i + \alpha r_i) = \sum t_i + \alpha \sum r_i = 1\).

  12. Thus, it is a convex combination for \(\bx\) if \(t_i + \alpha r_i \geq 0\) for every \(i=0, \dots, k\).

  13. Let

    \[ \alpha = \underset{i \ST r_i < 0}{\min}\left \{ - \frac{t_i}{r_i} \right \}. \]
  14. \(\alpha\) is well defined since there is at least one \(r_j < 0\). Let \(j\) be the index for which \(\alpha\) was obtained.

  15. Then, \(t_j + \alpha r_j = 0\).

  16. Also, we can see that \(t_i + \alpha r_i \geq 0\) for all \(i=0,\dots,k\).

  17. Thus, we have found a convex combination for \(\bx\) where the coefficient for \(\bx_j\) is 0.

  18. Thus, we have obtained a convex combination for \(\bx\) with \(k-1\) points.

  19. Repeating this process up to \(k-n\) times, we can obtain a convex combination of \(\bx\) consisting of \(n+1\) or less points.

9.2.7. Dimension#

Definition 9.10 (Dimension of a convex set)

The dimension of a convex set is defined to be the dimension of its affine hull.

If \(C\) is a convex set, then:

\[ \dim C = \dim \affine C. \]

Recall that the dimension of an affine set is equal to the dimension of the linear subspace associated with it (Definition 4.163).

  1. A circle will have a dimension of 2 even if it is in \(\RR^3\).

  2. A sphere will have a dimension of three.

9.2.8. Simplices#

Theorem 9.20 (Convex hull of a finite set of points)

Let \(S = \{ \bx_0, \dots, \bx_m \}\) be a finite set of points of \(\VV\). Then, \(\ConvexHull(S)\) consists of all the points of the form

\[ t_0 \bx_0 + \dots t_m \bx_m, \quad t_0 \geq 0, \dots, t_m \geq 0, \sum_{i=0}^m t_i = 1. \]

In \(\RR^n\), this is known as a polytope.

A Simplex is a convex hull of a finite set of affine independent points. The simplex provides a powerful coordinate system for the points within it in terms of barycentric coordinates.

Definition 9.11 (\(k\)-simplex)

Let \(k+1\) points \(\bv_0, \dots, \bv_k \in \VV\) be affine independent.

The simplex determined by them is given by

\[ C = \ConvexHull \{ \bv_0, \dots, \bv_k\} = \{ t_0 \bv_0 + \dots + t_k \bv_k \ST \bt \succeq 0, \bone^T \bt = 1\} \]

where \(\bt = [t_1, \dots, t_k]^T\) and \(\bone\) denotes a vector of appropriate size \((k)\) with all entries one.

In other words, \(C\) is the convex hull of the set \(\{\bv_0, \dots, \bv_k\}\).

A simplex is a convex set since it is a convex hull of its vertices. \(k\) stands for the dimension of the simplex. Recall that the dimension of a convex set is the dimension of its affine hull.

Example 9.3 (Simplex examples)

In \(\RR^n\):

  • A 0-simplex is a point.

  • A 1-simplex is a line segment (2 points).

  • A 2-simplex is a triangle (3 points).

  • A 3-simplex is a tetrahedron (4 points).

  • A 4-simplex is a 5-cell (5 points).

Theorem 9.21 (Barycentric coordinates)

Each point of a \(k\) simplex is uniquely expressible as a convex combination of the vertices.

Proof. Let \(C = \ConvexHull\{\bv_0, \bv_1, \dots, \bv_k \}\).

  1. Let \(\bv \in C\).

  2. Then, \(\bv = \sum_{i=0}^k t_i \bv_i\) with \(t_i \geq 0\) and \(\sum t_i = 1\).

  3. For contradiction, assume there was another representation: \(\bv = \sum_{i=0}^k r_i \bv_i\) with \(r_i \geq 0\) and \(\sum r_i = 1\).

  4. Then,

    \[ \sum_{i=0}^k t_i \bv_i = \sum_{i=0}^k r_i \bv_i \implies \sum_{i=0}^k (t_i - r_i) \bv_i = \bzero. \]
  5. But \(\{\bv_0, \bv_1, \dots, \bv_k \}\) are affine independent.

  6. Hence, \(t_i = r_i\).

  7. Thus, the representation is unique.

Definition 9.12 (Simplex midpoint)

The point \(\sum_{i=0}^k \frac{1}{k+1}{\bv_i}\) in a simplex \(C = \ConvexHull\{\bv_0, \dots, \bv_k \}\) is known as its midpoint or barycenter.

Theorem 9.22

The dimension of a convex set \(C\) is the maximum of the dimensions of the various simplices included in \(C\).

Proof. We need to show that there is a simplex \(S \subset C\) such that \(\dim S = \dim C\).

  1. Let \(A\) be any finite affine independent subset of \(C\).

  2. Since \(C\) is convex, hence \(A \subseteq C \implies \ConvexHull(A) \subseteq C\).

  3. Thus, \(C\) contains the simplices constructed from any set of finite affine independent points in \(C\).

  4. Thus, if \(A = \{\bv_0, \dots, \bv_k\}\) is a set of \(k+1\) affine independent points of \(C\), then \(\ConvexHull(A) \subseteq C\) implies that \(k \leq \dim C\).

  5. Thus, if \(S\) is a \(k\)-simplex such that \(S \subseteq C\), then \(\dim S = k \leq \dim C\).

  6. Let \(m\) be the maximum of the dimensions of the various simplices contained in \(C\).

  7. Then, there exist affine independent points \(\bv_0, \dots, \bv_m \in C\) such that the simplex \(S = \ConvexHull\{ \bv_0, \dots, \bv_m\} \subseteq C\).

  8. Let \(M\) be the affine hull of \(S\); i.e. \(M = \affine S\).

  9. Then, \(\dim M = m\) and \(M \subseteq \affine C\).

  10. If \(C \setminus M\) were nonempty, then there would be an element \(\bv \in C \setminus M\) which would be affine independent of \(\{ \bv_0, \dots, \bv_m\}\).

  11. That would lead to a set of \(m+2\) affine independent points in \(C\). That would mean that \(C\) contains a simplex of dimension \(m+1\). A contradiction.

  12. Hence, \(C \setminus M = \EmptySet\).

  13. Thus, \(C \subseteq M\).

  14. Since \(\affine C\) is the smallest affine set that contains \(C\), hence \(\affine C = M\).

  15. Thus, \(\dim C = m\).

9.2.9. Symmetric Reflections#

The symmetric reflection of a convex set is convex since convexity is preserved under scalar multiplication. See Theorem 9.30 below.

If a symmetric convex set contains a nonzero vector \(\bx\), then it contains the entire line segment between \(-\bx\) and \(\bx\).

9.2.10. Infinite Convex Combinations#

We can generalize convex combinations to include infinite sums.

Theorem 9.23

Let \(\theta_1, \theta_2, \dots\) satisfy

\[ \theta_i \geq 0, i = 1,2,\dots, \quad \sum_{i=1}^{\infty} \theta_i = 1, \]

and let \(\bx_1, \bx_2, \dots \in C\), where \(C \subseteq \VV\) is convex. Then

\[ \sum_{i=1}^{\infty} \theta_i \bx_i \in C, \]

if the series converges.

We can generalize it further to density functions.

Theorem 9.24

Let \(p : \VV \to \RR\) satisfy \(p(x) \geq 0\) for all \(x \in C\) and

\[ \int_{C} p(x) d x = 1 \]

Then

\[ \int_{C} p(x) x d x \in C \]

provided the integral exists.

Note that \(p\) above can be treated as a probability density function if we define \(p(x) = 0 \Forall x \in \VV \setminus C\).

9.2.11. Convexity Preserving Operations#

In the following, we will discuss several operations which transform a convex set into another convex set, and thus preserve convexity.

Understanding these operations is useful for determining the convexity of a wide variety of sets.

Usually, it is easier to prove that a set is convex by showing that it is obtained by a convexity preserving operation from a convex set compared to directly verifying the convexity property i.e.

\[ t \bx_1 + (1 - t) \bx_2 \in C \Forall \bx_1, \bx_2 \in C, t \in [0,1]. \]

9.2.11.1. Intersection and Union#

Theorem 9.25 (Intersection of convex sets)

If \(S_1\) and \(S_2\) are convex sets then \(S_1 \cap S_2\) is convex.

Proof. Let \(\bx_1, \bx_2 \in S_1 \cap S_2\). We have to show that

\[ t \bx_1 + (1 - t) \bx_2 \in S_1 \cap S_2, \Forall t \in [0,1]. \]

Since \(S_1\) is convex and \(\bx_1, \bx_2 \in S_1\), hence

\[ t \bx_1 + (1 - t) \bx_2 \in S_1, \Forall t \in [0,1]. \]

Similarly

\[ t \bx_1 + (1 - t) \bx_2 \in S_2, \Forall t \in [0,1]. \]

Thus

\[ t \bx_1 + (1 - t) \bx_2 \in S_1 \cap S_2, \Forall t \in [0,1]. \]

which completes the proof.

We can generalize it further.

Theorem 9.26 (Intersection of arbitrary collection of convex sets)

Let \(\{ A_i\}_{i \in I}\) be a family of sets such that \(A_i\) is convex for all \(i \in I\). Then \(\cap_{i \in I} A_i\) is convex.

Proof. Let \(\bx_1, \bx_2\) be any two arbitrary elements in \(\cap_{i \in I} A_i\).

\[\begin{split} &\bx_1, \bx_2 \in \cap_{i \in I} A_i\\ \implies & \bx_1, \bx_2 \in A_i \Forall i \in I\\ \implies &t \bx_1 + (1 - t) \bx_2 \in A_i \Forall t \in [0,1] \Forall i \in I \text{ since $A_i$ is convex }\\ \implies &t \bx_1 + (1 - t) \bx_2 \in \cap_{i \in I} A_i. \end{split}\]

Hence \(\cap_{i \in I} A_i\) is convex.

Corollary 9.1 (Arbitrary intersection of closed half spaces)

Let \(I\) be an index set. Let \(\ba_i \in \VV\) and \(b_i \in \RR\) for every \(i \in I\). Then, the set:

\[ C = \{ \bx \in \VV \ST \langle \bx, \ba_i \rangle \leq b_i \Forall i \in I\} \]

is convex.

Proof. Since each of the half spaces is convex, hence so is their intersection.

This result is applicable for open half spaces and hyperplanes too too. It also applies for a mixture of hyperplanes and half-spaces.

Corollary 9.2

The solution set of a system of linear equations and inequalities in \(\RR^n\) is convex.

Proof. We proceed as follows:

  • The solution set of each linear equation is a hyperplane.

  • The solution set of each linear inequality is a half-space (closed or open).

  • The solution set of a system of linear equations and inequalities is the intersection of these hyperplanes and half-spaces.

  • Each hyperplane and each half-space is convex.

  • Hence, their intersection is convex.

Theorem 9.27 (Intersection and union of two sets)

Let \(C_1\) and \(C_2\) be convex in \(\VV\). Then, the largest convex set contained in both is \(C_1 \cap C_2\). And, the smallest convex set containing both is \(\ConvexHull (C_1 \cup C_2)\).

Proof. Let \(C\) be a convex set contained in both \(C_1\) and \(C_2\). Then, \(C \subseteq C_1 \cap C_2\). But \(C_1 \cap C_2\) is convex (Theorem 9.25). Hence, \(C_1 \cap C_2\) is the largest convex set contained in both \(C_1\) and \(C_2\).

Let \(C\) be a convex set which contains both \(C_1\) and \(C_2\). Then, \(C_1 \cup C_2 \subseteq C\). The smallest convex set containing \(C_1 \cup C_2\) is its convex hull given by \(\ConvexHull(C_1 \cup C_2)\) (Theorem 9.18).

Theorem 9.28 (Intersection and union of arbitrary sets)

Let \(I\) be an index set and \(\FFF = \{ C_i \}_{i \in I}\) be a family of convex sets in \(\VV\). Then, the largest convex set contained in every set of \(\FFF\) is:

\[ \bigcap_{i \in I}C_i. \]

And, the smallest convex set containing every set of \(\FF\) is

\[ \ConvexHull \left (\bigcup_{i \in I} C_i \right ). \]

Proof. Let \(C\) be a convex set contained in every set of \(\FFF\). Then, \(C \subseteq \bigcap_{i \in I}C_i\). But \(\bigcap_{i \in I}C_i\) is convex (Theorem 9.26). Hence, \(\bigcap_{i \in I}C_i\) is the largest convex set contained in every set of \(\FFF\).

Let \(C\) be a convex set which contains every set of \(\FF\). Then, \(\bigcup_{i \in I} C_i \subseteq C\). The smallest convex set containing every set of \(\FF\) is its convex hull given by \(\ConvexHull(\bigcup_{i \in I} C_i)\) (Theorem 9.18).

9.2.11.2. Affine Functions#

Let us start with some simple results.

Theorem 9.29 (Convexity and translation)

Convexity is preserved under translation.

\(C\) (a subset of \(\VV\)) is convex if and only if \(C + \ba\) is convex for every \(\ba \in \VV\).

Proof. Let \(C \subseteq \VV\).

  1. Assume \(C\) is convex.

  2. Then, for every \(\bx, \by \in C\) and every \(t \in [0,1]\), \(t \bx + (1-t) \by \in C\).

  3. Let \(\ba \in \VV\).

  4. Let \(\bu, \bv \in C + \ba\).

  5. Then, \(\bu = \bx + \ba\) and \(\bv = \by + \ba\) for some \(\bx, \by \in C\).

  6. Then,

    \[\begin{split} t \bu + (1-t) \bv &= t (\bx + \ba) + (1-t ) (\by + \ba)\\ &= t \bx + (1-t) \by + \ba. \end{split}\]
  7. But \(t \bx + (1-t) \by \in C\) since \(C\) is convex.

  8. Then, \(t \bx + (1-t) \by + \ba \in C + \ba\).

  9. Thus, \(t \bu + (1-t) \bv \in C + \ba\).

  10. Thus, \(C + \ba\) is convex.

We can follow the same argument in the opposite direction to establish that \(C + \ba\) is convex implies \(C\) is convex.

Theorem 9.30 (Convexity and scalar multiplication)

Convexity is preserved under scalar multiplication.

\(C\) (a subset of \(\VV\)) is convex if and only if \( \alpha C\) is convex for every \(\alpha \in \RR\).

Proof. Let \(C \subseteq \VV\).

  1. Assume \(C\) is convex.

  2. Let \(\alpha \in \RR\).

  3. Let \(\bu, \by \in \alpha C\).

  4. Then, \(\bu = \alpha \bx\) and \(\bv = \alpha \by\) for some \(\bx, \by \in C\).

  5. Let \(t \in [0,1]\).

  6. \(t\bu + (1-t)\bv = \alpha (t \bx + (1-t)\by)\).

  7. But \(t \bx + (1-t)\by \in C\) since \(C\) is convex.

  8. Hence, \(\alpha (t \bx + (1-t)\by)\) in \(\alpha C\).

  9. Hence, \(t\bu + (1-t)\bv \in \alpha C\).

  10. Thus, \(\alpha C\) is convex.

Similar argument in opposite direction establishes that \(\alpha C\) is convex implies \(C\) is convex.

Recall that an affine function \(f : \VV \to \EE\) from a real vector space \(\VV\) to another real vector space \(\EE\) is a function which satisfies

\[ f(t \bx + (1-t)\by) = tf(\bx) + (1 -t) f(\by) \]

for every \(t \in \RR\).

Recall from Theorem 4.193 that an affine function can be written as a linear transformation followed by a translation:

\[ f(\bx) = T (\bx) + \bb \]

where \(T\) is a linear operator.

Example 9.4

An affine function \(f : \RR^n \to \RR^m\) takes the form of a matrix multiplication plus a vector addition:

\[ f(\bx) = \bA \bx + \bb \]

where \(\bA \in \RR^{m \times n}\) and \(\bb \in \RR^m\).

Theorem 9.31 (Image under affine function)

Let \(S \subseteq \VV\) be convex and \(f : \VV \to \EE\) be an affine function. Then the image of \(S\) under \(f\) given by

\[ f(S) = \{ f(\bx) | \bx \in S\} \]

is a convex set.

Proof. We proceed as follows:

  1. Let \(\bu, \bv \in f(S)\).

  2. Then, \(\bu = f(\bx)\) and \(\bv = f(\by)\) for some \(\bx, \by \in S\).

  3. Let \(0 \leq t \leq 1\).

  4. Then, \(\bz = t \bx + (1-t)\by \in S\) since \(S\) is convex.

  5. Since \(f\) is affine, hence

    \[ f(\bz) = f(t \bx + (1-t)\by ) = t f(\bx) + (1-t)f(\by) = t \bu + (1-t)\bv. \]
  6. Since \(\bz \in S\), hence \(f(\bz) = t \bu + (1-t)\bv \in f(S)\).

  7. We have shown that for any \(\bu, \bv \in f(S)\) and any \(0 \leq t \leq 1\), \(t \bu + (1-t)\bv \in f(S)\).

  8. Thus, \(f(S)\) is convex.

It applies in the reverse direction also.

Theorem 9.32 (Inverse image under affine function)

Let \(f : \VV \to \EE\) be affine and \(S \subseteq \EE\) be convex. Then the inverse image of \(S\) under \(f\) given by

\[ f^{-1}(S) = \{ \bx \in \VV \ST f(\bx) \in S\} \]

is convex.

Proof. Denote \(R = f^{-1}(S)\). We need to show that if \(S\) is convex then \(R\) is convex too.

We proceed as follows:

  1. Let \(\bx, \by \in R\).

  2. Let \(\bu = f(\bx)\) and \(\bv = f(\by)\).

  3. \(\bu, \bv \in S\).

  4. Let \(0 \leq t \leq 1\).

  5. Then, \(\bw = t \bu + (1-t)\bv \in S\) since \(S\) is convex.

  6. Let \(\bz = t \bx + (1-t) \by\).

  7. Since \(f\) is affine, hence

    \[ \bw = t \bu + (1-t)\bv = t f(\bx) + (1-t) f(\by) = f(t \bx + (1-t) \by) = f(\bz). \]
  8. Since \(\bw \in S\), hence \(\bz \in R\) as \(\bw = f(\bz)\).

  9. We have shown that for any \(\bx, \by \in R\) and any \(0 \leq t \leq 1\), \(t \bx + (1-t)\by \in R\).

  10. Thus, \(R\) is convex.

Example 9.5 (Affine functions preserving convexity)

Let \(S \in \RR^n\) be convex.

  • For some \(\alpha \in \RR\), \(\alpha S\) is convex. This is the scaling operation.

  • For some \(\ba \in \RR^n\), \(S + \ba\) is convex. This is the translation operation.

  • Let \(n = m + k\). Then, let \(\RR^n = \RR^m \times \RR^k\). A vector \(\bx \in S\) can be written as \(\bx = (\bx_1, \bx_2)\) where \(\bx_1 \in \RR^m\) and \(\bx_2 \in \RR^k\). Then

    \[ T = \{ \bx_1 \in \RR^m \ST (\bx_1, \bx_2) \in S \text{ for some } \bx_2 \in \RR^k\} \]

    is convex. This is the projection operation. It projects vectors from \(\RR^n\) to \(\RR^m\) by dropping last \(k\) entries.

Example 9.6 (System of linear equations)

Consider the system of linear equations \(\bA \bx = \by\) where \(A \in \RR^{m \times n}\).

If \(\by\) ranges over a convex set, then the corresponding set of solutions also ranges over a convex set due to Theorem 9.32.

The nonnegative orthant \(\RR^n_+\) is a convex set.

Let \(Y = \RR^m_+ + \ba\) for some \(\ba \in \RR^m\); i.e.,

\[ Y = \{ \by \ST \by \succeq \ba \}. \]

Then, \(\bA^{-1} Y\) is the set of vectors satisfying the inequality

\[ \bA \bx \succeq \ba. \]

Thus, the solution set of a system of linear inequalities of the form \(\bA \bx \succeq \ba\) is convex.

Now, if \(X = \RR^n_+\), then \(\bA X\) is the set of vectors \(\by \in \RR^m\) such that the equation \(\bA \bx = \by\) has a nonnegative solution (\(\bx \succeq \bzero\)). Since \(X\) is convex, so is \(\bA X\).

Theorem 9.33 (Orthogonal projection of convex set)

The orthogonal projection of a convex set \(C\) on a subspace \(V\) is another convex set.

Proof. We recall that orthogonal projection is a linear mapping and thus an affine function. By Theorem 9.31, image of a convex set under an affine function is convex. Hence proved.

9.2.11.3. Set Addition#

Theorem 9.34 (Convexity and set addition)

Let \(C_1\) and \(C_2\) be two convex subsets of \(\VV\). Then \(C_1 + C_2\) is convex.

Proof. We proceed as follows:

  1. Let \(\bx, \by \in C_1 + C_2\).

  2. Then, \(\bx = \bx_1 + \bx_2\) for some \(\bx_1 \in C_1\) and some \(\bx_2 \in C_2\).

  3. Similarly, \(\by = \by_1 + \by_2\) for some \(\by_1 \in C_1\) and some \(\by_2 \in C_2\).

  4. Let \(0 \leq t \leq 1\).

  5. Then:

    \[ t \bx + (1 - t) \by = t (\bx_1 + \bx_2) + (1-t)(\by_1 + \by_2) = t \bx_1 + (1 - t) \by_1 + t \bx_2 + (1 - t) \by_2. \]
  6. But, \(\bz_1 = t \bx_1 + (1 - t) \by_1 \in C_1\) since \(C_1\) is convex.

  7. Similarly, \(\bz_2 = t \bx_2 + (1 - t) \by_2 \in C_2\) since \(C_2\) is convex.

  8. Hence, \(t \bx + (1 - t) \by = \bz_1 + \bz_2 \in C_1 + C_2\).

  9. Thus, \(C_1 + C_2\) is convex.

One way to think geometrically about set addition is as the union of all translates of \(C_1\) given by \(C_1 + \bx\) as \(\bx\) varies over \(C_2\).

Theorem 9.35

A set \(C\) is convex if and only if

\[ (1-t) C + t C = C \Forall t \in [0,1]. \]

Proof. Assume \(C\) is convex:

  1. \((1-t) C + t C = \{ t \bx + (1-t) \by \ST \bx, \by \in C \}\).

  2. Thus, \((1-t) C + t C \subseteq C\).

  3. For every \(\bx \in C\), \((1-t)\bx \in (1-t)C\) and \(t \bx \in t C\).

  4. Thus, \((1-t)\bx + t \bx = \bx \in (1-t) C + t C\).

  5. Thus, \(C \subseteq (1-t) C + t C\).

  6. Combining, we get \((1-t) C + t C = C\).

Assume \((1-t) C + t C = C\) for every \(t \in [0,1]\).

  1. Let \(\bx, \by \in C\) and \(t\in [0,1]\).

  2. Then, \((1-t)\bx \in (1-t)C\) and \(t \by \in t C\).

  3. Hence, \((1-t)\bx + t \by \in (1-t) C + t C = C\).

  4. Thus, \(C\) is convex.

Theorem 9.36 (Convexity and linear combination)

Convexity is preserved under linear combinations.

Let \(C_1, \dots, C_k\) be convex. Let \(t_1, \dots, t_k \in \RR\). Then, their linear combination:

\[ C = t_1 C_1 + \dots + t_k C_k \]

is convex.

Proof. Due to Theorem 9.30, \(t_i C_i\) are convex for \(i=1,\dots,k\).

By (finite) repeated application of Theorem 9.34, their sum is also convex.

Theorem 9.37 (Nonnegative scalar multiplication distributive law)

Let \(C\) be convex and \(t_1, t_2 \geq 0\). Then

\[ (t_1 + t_2)C = t_1 C + t_2 C. \]

Proof. From Theorem 4.16, we know that:

\[ (t_1 + t_2)C \subseteq t_1 C + t_2 C. \]

We now show that \(t_1 C + t_2 C \subseteq (t_1 + t_2)C\).

  1. If both \(t_1 = t_2 = 0\), then we have trivial equality.

  2. If either of \(t_1\) or \(t_2\) is 0, then also we have trivial equality.

  3. Now, consider the case \(t_1, t_2 > 0\).

  4. Define \(t = t_1 + t_2 > 0\) and \(r = \frac{t_1}{t}\).

  5. Then, \(1-r = \frac{t_2}{t}\).

  6. Then, since \(C\) is convex, hence \(r C + (1-r) C \subseteq C\).

  7. Multiplying by \(t\) on both sides, we get: \(t_1 C + t_2C \subseteq (t_1 + t_2) C\).

For the special case of \(t_1 = r\) and \(t_2 = 1 - r\) with \(r \in [0,1]\), we get:

\[ C = r C + (1- r)C. \]

Some implications are \(C + C = 2C\), \(C+C+C=3C\) and so forth if \(C\) is convex.

Theorem 9.38 (Convex combinations over arbitrary unions)

Let \(I\) be an index set and \(\FFF = \{C_i \}_{i \in I}\) be a family of convex sets in \(\VV\). Let \(C\) be given as:

\[ C = \ConvexHull\left (\bigcup_{i \in I} C_i \right); \]

i.e., \(C\) is the convex hull of the union of the family of sets \(\FFF\). Then,

\[ C = \bigcup \left \{\sum_{i \in I} t_i C_i \right \} \]

where the union is taken over all finite convex combinations (i.e. over all nonnegative choices of \(t_i\) such that only finitely many \(t_i\) are nonzero and they add up to 1).

Proof. We proceed as follows:

  1. Let \(\bx \in C\).

  2. Then, \(\bx\) is a convex combination of elements in \(\bigcup_{i \in I} C_i\).

  3. Thus, \(\bx = \sum_{i=1}^m a_i \by_i\) where \(\by_i \in \bigcup_{i \in I} C_i\), \(a_i \geq 0\) and \(\sum a_i = 1\).

  4. Drop all the terms from \(\bx\) where \(a_i = 0\).

  5. If \(\by_i, \by_j\) belong to some same \(C_k\), then, we can replace \(a_i \by_i + a_j \by_j\) with some \(a \by = a_i \by_i + a_j \by_j\) where \(a = a_i + a_j\). Note that, with these assumptions, \(\by = \frac{a_i}{a_i + a_j} \by_i + \frac{a_j}{a_i + a_j} \by_j\) is a convex combination of \(\by_i\) and \(\by_j\), hence \(\by \in C_k\) since \(C_k\) is convex.

  6. Thus, terms from a single \(C_k\) in \(\bx\) can be reduced by a single term.

  7. Thus, we can simplify \(\bx\) such that

    \[ \bx = \sum_{j=1}^p b_j \bx_j \]

    such that each \(\bx_j\) belongs to a different \(C_{i_j}\) with \(i_j \in I\), \(b_j > 0\) and \(\sum b_j = 1\).

  8. Thus, \(C\) is a union of finite convex combinations of the form

    \[ b_1 C_{i_1} + \dots + b_p C_{i_p} \]

    where \(i_1, \dots, i_p \in I\) are different indices and \(b_j > 0\), \(\sum b_j = 1\).

  9. This is the same set as \(\bigcup \left \{\sum_{i \in I} t_i C_i \right \}\) except for notational differences.

Corollary 9.3 (Convex hull of union)

Let \(C_1\) and \(C_2\) be convex sets. Let \(C\) be the convex hull of their union:

\[ C = \ConvexHull (C_1 \cup C_2). \]

Then,

\[ C = \bigcup_{t \in [0,1]} \left [ (1 - t) C_1 + t C_2 \right ]. \]

9.2.11.4. Partial Addition#

Recall the notion of direct sum of two vector spaces \(\VV\) and \(\WW\) over \(\RR\) given by \(\VV \oplus \WW\).

Theorem 9.39 (Partial addition on convex sets)

Let \(\VV\) and \(\WW\) be real vector spaces. Let \(\VV \oplus \WW\) be their direct sum. Let \(C_1\) and \(C_2\) be convex sets in \(\VV \oplus \WW\). Let \(C\) be the set of vectors \(\bx = (\by, \bz)\) such that there exist vectors \(\bz_1, \bz_2 \in \WW\) with \((\by, \bz_1) \in C_1\), \((\by, \bz_2) \in C_2\) and \(\bz_1 + \bz_2 = \bz\). Then, \(C\) is a convex set in \(\VV \times \WW\).

Proof. Let \((\by, \bz) \in C\) such that there exist vectors \(\bz_1, \bz_2 \in \WW\) with \((\by, \bz_1) \in C_1\), \((\by, \bz_2) \in C_2\) and \(\bz_1 + \bz_2 = \bz\).

Let \((\by', \bz') \in C\) such that there exist vectors \(\bz'_1, \bz'_2 \in \WW\) with \((\by', \bz'_1) \in C_1\), \((\by', \bz'_2) \in C_2\) and \(\bz'_1 + \bz'_2 = \bz'\).

Let \(t \in [0,1]\). Consider the vector \((\by'', \bz'') = t(\by, \bz) + (1-t)(\by', \bz')\).

  1. \(\by'' = t \by + (1-t) \by'\).

  2. \(\bz'' = t \bz + (1-t) \bz' = t(\bz_1 + \bz_2) + (1-t)(\bz'_1 + \bz'_2)\).

  3. Let \(\bz_1'' = t \bz_1 + (1-t) \bz_1'\).

  4. Let \(\bz_2'' = t \bz_2 + (1-t) \bz_2'\).

  5. Since \((\by, \bz_1), (\by', \bz'_1) \in C_1\) and \(C_1\) is convex, hence \((\by'', \bz''_1) \in C_1\).

  6. Since \((\by, \bz_2), (\by', \bz'_2) \in C_2\) and \(C_2\) is convex, hence \((\by'', \bz''_2) \in C_2\).

  7. But then, we note that \(\bz'' = \bz''_1 + \bz''_2\).

  8. Thus, \((\by'', \bz''_1) \in C_1\) and \((\by'', \bz''_2) \in C_2\) implies that \((\by'', \bz'') \in C\).

  9. Thus, \(C\) is convex.

We can write a version of the theorem above for \(\RR^n\).

Corollary 9.4 (Partial addition on convex sets in Euclidean space)

Let \(C_1\) and \(C_2\) be convex sets in \(\RR^{n + m}\). Let \(C\) be the set of vectors \(\bx = (\by, \bz)\) such that there exist vectors \(\bz_1, \bz_2 \in \RR^m\) with \((\by, \bz_1) \in C_1\), \((\by, \bz_2) \in C_2\) and \(\bz_1 + \bz_2 = \bz\). Then, \(C\) is a convex set in \(\RR^{n+m}\).

The relationship between \(C\) and \(C_1,C_2\) is known as partial addition.

  • When \(\VV = \{ \bzero\}\) we are left with the result that \(C_1, C_2\) convex implies \(C_1 + C_2\) is convex.

  • When \(\WW = \{\bzero\}\), we are left with the result that \(C_1, C_2\) convex implies \(C_1 \cap C_2\) is convex.

  • In between, we have a spectrum of results where for a vector in \(C\), part of the representation must be common between \(C_1\) and \(C_2\) while the remaining part must be the sum of corresponding parts of vectors in \(C_1\) and \(C_2\).

  • In other words, if a vector space can be decomposed as a direct sum of two subspaces, then we have intersection or representation in one subspace while addition in the other.

  • This partial addition (binary) operation is commutative as well as associative.

Partial additions appear naturally in convex cones in \(\VV \oplus \RR\) generated by a convex set in \(\VV\). See Observation 9.3 and discussion thereafter.

9.2.11.5. Cartesian Product/Direct Sum#

Theorem 9.40 (Direct sum of convex sets)

Let \(\VV\) and \(\WW\) be real vector spaces. Let \(C \subseteq \VV\) and \(D \subseteq \WW\) be convex subsets of \(\VV\) and \(\WW\) respectively. Then, \(C \oplus D\) is a convex subset of \(\VV \oplus \WW\).

More generally, if \(\VV_1, \dots, \VV_k\) are real vector spaces and \(C_i \subseteq \VV_i\) are convex subsets for \(i=1,\dots,k\), then \(C = C_1 \oplus \dots \oplus C_k\) is convex in the direct sum of vector spaces \(\VV_1 \oplus \dots \oplus \VV_k\).

Proof. If either \(C\) or \(D\) is empty, then \(C \oplus D\) is empty, hence convex. We shall thus assume that both \(C\) and \(D\) are nonempty.

  1. Let \(\bz_1, \bz_2 \in C \oplus D\) and \(t \in (0, 1)\).

  2. Then, \(\bz_1 = (\bx_1, \by_1)\) and \(\bz_2 = (\bx_2, \by_2)\) such that \(\bx_1, \bx_2 \in C\) and \(\by_1, \by_2 \in D\).

  3. Since \(C\) and \(D\) are convex, hence \(\bx = t \bx_1 + (1-t) \bx_2 \in C\) and \(\by = t \by_1 + (1- t) \by_2 \in D\).

  4. Now,

    \[\begin{split} \bz &= t \bz_1 + (1 - t) \bz_2\\ &= t(\bx_1, \by_1) + (1-t)(\bx_2, \by_2)\\ &= (t \bx_1 + (1-t) \bx_2, t \by_1 + (1- t) \by_2)\\ &= (\bx, \by). \end{split}\]
  5. Since \(\bx \in C\) and \(\by \in D\), hence \(\bz = (\bx, \by)\in C \oplus D\).

  6. Thus, \(C \oplus D\) is closed under convex combination.

  7. Thus, \(C \oplus D\) is convex.

The generalization for multiple real vector spaces is easily verifiable through induction.

9.2.11.6. Projection#

Theorem 9.41 (Projection of a direct sum)

Let \(\VV\) and \(\WW\) be real vector spaces. Let \(C \subseteq \VV\) and \(D \subseteq \WW\). Assume that \(C \oplus D\) is a convex subset of \(\VV \oplus \WW\). Then, \(C\) and \(D\) are convex subsets of \(\VV\) and \(\WW\) respectively.

More generally, if \(\VV_1, \dots, \VV_k\) are real vector spaces and \(C_i \subseteq \VV_i\) are subsets for \(i=1,\dots,k\), such that \(C = C_1 \oplus \dots \oplus C_k\) is convex in the direct sum of vector spaces \(\VV_1 \oplus \dots \oplus \VV_k\); then \(C_i\) are convex subsets of \(\VV_i\) for \(i=1,\dots,k\).

Proof. Consider the case of two vector spaces \(\VV\) and \(\WW\).

  1. Let \(\bx_1, \bx_2 \in C\) and \(t \in (0,1)\).

  2. Pick any \(\by \in D\).

  3. Then, \((\bx_1, \by), (\bx_2, \by) \in C \oplus D\).

  4. Since \(C \oplus D\) is convex, hence

    \[ t (\bx_1, \by) + (1-t) (\bx_2, \by) = (t \bx_1 + (1-t) \bx_2, \by) \in C \oplus D. \]
  5. Thus, \(t \bx_1 + (1-t) \bx_2 \in C\).

  6. Thus, \(C\) is convex.

  7. Similarly \(D\) is also convex.

The argument can be extended by mathematical induction for multiple vector spaces.

Theorem 9.42 (Projection of a convex set)

Let \(\VV\) and \(\WW\) be real vector spaces. Let \(C \subseteq \VV \oplus \WW\) be a convex set of \(\VV \oplus \WW\).

For every \(\bx \in \VV\), define

\[ D_{\bx} = \{ \by \in \WW \ST (\bx, \by) \in C \}. \]

Then \(D_{\bx}\) is convex for every \(\bx \in \VV\).

Similarly, if for every \(\by \in \WW\), we define

\[ E_{\by} = \{ \bx \in \VV \ST (\bx, \by) \in C \} \]

then \(E_{\by}\) is convex for every \(\by \in \WW\).

Proof. If \(D_{\bx}\) is empty then it is convex vacuously. Hence assume that \(D_{\bx}\) is nonempty.

  1. Then for every \(\by \in D_{\bx}\) there exists \((\bx, \by) \in C\).

  2. Let \(\bu, \bv \in D_{\bx}\) and \(t \in [0,1]\).

  3. Then \((\bx, \bu) \in C\) and \((\bx, \bv) \in C\).

  4. Since \(C\) is convex hence \(t(\bx, \bu) + (1-t)(\bx, \bv) \in C\).

  5. i.e., \((\bx, t\bu + (1-t)\bv) \in C\).

  6. This implies that \(t\bu + (1-t)\bv \in D_{\bx}\).

  7. Hence \(D_{\bx}\) is convex.

The argument for the convexity of \(E_{\by}\) is identical.

9.2.12. Extreme Points#

Definition 9.13 (Extreme points of convex sets)

Let \(VV\) be a real vector space and let \(C\) be a subset of \(\VV\).

A point \(\bx \in S\) is called an extreme point of \(S\) if there do not exist \(\bx_1, \bx_2 \in S\) with \(\bx_1 \neq \bx_2\) and \(t \in (0, 1)\) such that

\[ \bx = t \bx_1 + (1-t) \bx_2. \]

In other words, \(\bx\) cannot be expressed as a nontrivial convex combination of two different points in \(S\).

The set of extreme points of a set \(S\) is denoted by \(\extreme S\).

Example 9.7 (Extreme points)

  1. Let \(C = [0,1] \subseteq \RR\). Then, \(0\) and \(1\) are extreme points of \(C\).

  2. Let \(C = (0, 1) \subseteq \RR\). \(C\) doesn’t have any extreme point.

  3. In a triangle, the three vertices are extreme points.

  4. In a convex polytope, all the vertices are extreme points.

A more intricate example of the set of extreme points for the set \(P = \{ \bx \in \RR^n \ST \bA \bx = \bb, \bx \succeq \bzero \}\) is discussed in Theorem 9.59.