9.2. Convex Sets#

Throughout this section, we assume that V is a real vector space. Wherever necessary, it is equipped with a norm :VR or a real inner product ,:V×VR. It is also equipped with a metric d(x,y)=xy.

9.2.1. Line Segments#

Definition 9.5 (Line segment)

Let x1 and x2 be two points in V. Points of the form

y=(1θ)x1+θx2 where 0θ1

form a (closed) line-segment between x1 and x2. The closed line segment is denoted by [x1,x2].

[x1,x2]{(1θ)x1+θx2|0θ1}.

Similarly, we define an open line segment as:

(x1,x2){(1θ)x1+θx2|0<θ<1}.

The half-open segment (x1,x2] is defined as:

(x1,x2]{(1θ)x1+θx2|0<θ1}.

The half-open segment [x1,x2) is defined as:

[x1,x2){(1θ)x1+θx2|0θ<1}.

9.2.2. Convex Sets#

Definition 9.6 (Convex set)

A set CV is convex if the line segment between any two points in C lies in C. i.e.

θx1+(1θ)x2Cx1,x2C and 0θ1.

The empty set is vacuously convex. The entire vector space V is convex since it contains all the line segments between any pair of points in the space.

../_images/pic_convex_sets.png

Fig. 9.1 Examples of convex sets#

Observation 9.2

Since a convex set contains the line segment between any two points, any line segment is convex by definition.

Geometrically, a convex set has no holes, hollows, pits or dimples. A set is convex if from each point in the set, it is possible to see every other point without having the line of sight pass outside the set.

Example 9.1 (Real line)

On the real line R, the empty set, sing points, intervals (closed, open, half open), half lines, and the entire real line are convex sets. There are no other convex sets possible.

Theorem 9.7

Any linear subspace is convex.

Proof. Let E be a linear subspace of V. Then E is closed under addition and scalar multiplication. Thus, for any x,yE and 0t1,

tx+(1t)yE.

Thus, E is convex.

Theorem 9.8

Any affine set is convex.

Proof. Let CV be an affine set. By definition, for any x,yC and any tR, tx+(1t)yC. It is valid in particular for 0t1. Thus, C is convex.

Theorem 9.9

Any hyperplane is convex since it is affine.

Theorem 9.10

Half spaces are convex.

Proof. Consider H+ defined as:

H+={x:a,xb}

Let x,yH+. Then:

a,xb and a,yb.

For some 0t1:

a,tx+(1t)y=ta,x+(1t)a,ytb+(1t)b=b.

Thus, x+(1t)yH+. Analogous proofs apply for other types of half spaces.

Theorem 9.11 (Convex set as convex combination of itself)

Let C be a nonempty subset of V. If C is convex then for every t1,t20, we have

(t1+t2)C=t1C+t2C.

In particular, if t1,t20 such that t1+t2=1, then

C=t1C+t2C.

Proof. The statement (t1+t2)Ct1C+t2C is valid even for sets which are not convex.

  1. Let xt1+t2)C.

  2. Then there exist yC such that x=t1y+t2y.

  3. Hence t1yt1C and t2yt2C.

  4. Hence x=t1y+t2yt1C+t2C.

We now show the converse.

  1. Let xt1C+t2C.

  2. Then there exist x1,x2C such that

    x=t1x1+t2x2.
  3. If t1=t2=0, then x=0 and x(0+0)C.

  4. Now assume that t1+t2>0.

  5. By the convexity of C,

    y=t1t1+t2x1+t2t1+t2x2C.
  6. Hence x=(t1+t2)y(t1+t2)C.

  7. Hence t1C+t2C(t1+t2)C.

Together, we have

(t1+t2)C=t1C+t2C.

Theorem 9.12 (Convex set as union of line segments)

Let C be a convex subset of V. Then C is the union of all the closed line segments connecting the points of the set. In other words

C=x,yC[x,y].

Proof. Let D=x,yC[x,y].

If C is empty, then D is also empty. Hence there is nothing to prove. If C={x} is a singleton, then D consists of the union of a single line segment

[x,x]={x}.

So C=D. We now consider the case where C consists of more than one point.

We first show that CD.

  1. Let xC.

  2. Then [x,x]={x} is a line segment of C.

  3. Hence x[x,x]D.

  4. Hence CD.

We now show the converse.

  1. Let zD.

  2. Then there exists x,yC such that z[x,y].

  3. Then by convexity of C, [x,y]C.

  4. Hence z[x,y]C.

  5. Hence DC.

Together, C=D.

9.2.3. Rays#

Definition 9.7 (Ray)

A ray R is defined as

R{x0+tv|t0}

where v0 indicates the direction of ray and x0 is the base or origin of ray.

Theorem 9.13

A ray is convex.

Proof. Let a ray be given as:

R={x0+tv|t0}.

Let u,vR. Thus, there is tu,tv0 such that:

u=x0+tuv and v=x0+tvv.

Now, for some 0r1,

ru+(1r)v=r(x0+tuv)+(1r)(x0+tvv)=x0+(rtu+(1r)tv)v.

Since rtu+(1r)tv0, hence ru+(1r)vR.

9.2.4. Balls#

Theorem 9.14

An open ball B(a,r) is convex for any norm :VR.

Proof. Recall that an open ball in a normed linear space is defined as:

B(a,r)={xV|xa<r}.

Let x,yB(a,r) and let 0t1. Then,

tx+(1t)ya=t(xa)+(1t)(ya).

By triangle inequality:

t(xa)+(1t)(ya)t(xa)+(1t)(ya)=txa+(1t)ya<tr+(1t)r=r.

Thus,

tx+(1t)ya<rtx+(1t)yB(a,r).

Theorem 9.15

A closed ball B[a,r] is convex for any norm :VR.

Proof. Recall that a closed ball in a normed linear space is defined as:

B[a,r]={xV|xar}.

Let x,yB(a,r) and let 0t1. Then,

tx+(1t)ya=t(xa)+(1t)(ya).

By triangle inequality:

t(xa)+(1t)(ya)t(xa)+(1t)(ya)=txa+(1t)yatr+(1t)r=r.

Thus,

tx+(1t)yartx+(1t)yB[a,r].

9.2.5. Convex Combinations#

Definition 9.8 (Convex combination)

We call a point of the form θ1x1++θkxk, where θ1++θk=1 and θi0,i=1,,k, a convex combination of the points x1,,xk.

It is like a weighted average of the points xi. A weights θi in a convex combination can be interpreted as probabilities or proportions.

Example 9.2 (Center of mass)

Consider a system of particles pi, i=1,,n each with mass mi and location in space as xi. The center of mass x satisfies the equation:

i=1nmi(xix)=0.

Solving this equation gives us:

x=i=1nmimxi

where m=i=1nmi.

If we assign θi=mim, we notice that θi0 and i=1nθi=1. We can now write the center of mass as:

x=i=1nθixi

which is a convex combination of the locations xi where θi gives the proportion of contribution of each particle according to its mass.

Remark 9.2 (Convex combinations and unit simplex)

We recall that the unit simplex in Rn is given by

Δn={tRn|t,1=1,t0}={tRn|t1++tn=1,t1,,tn0}.

Thus, the coefficients for convex combinations of n points are drawn from Δn.

Theorem 9.16 (Closure under convex combinations)

A set is convex if and only if it contains all convex combinations of its points.

Let V be a real vector space and C be a subset of V. Then, C is convex if and only if for any mN, for any x1,,xmC, and for every tΔm, t1x1++tmxmC.

Proof. We know that a set C is convex is equivalent to saying that it contains all 2 point convex combinations; i.e, for any x1,x2C, t1,t20 and t1+t2=1,

t1x1+t2x2C.

We first show that if C is convex, it contains all its (finite) convex combination by induction.

  1. By definition C contains all its 2 point convex combinations.

  2. As induction hypothesis, assume that C contains all convex combinations of m1 or fewer points where m>2.

  3. Consider a convex combination of m points

    i=1mtixi

    where xiC, ti0, ti=1.

  4. Since t1=1, hence at least one of ti<1.

  5. Without loss of generality, assume tm<1.

  6. Note that tm<1 means that 1tm>0.

  7. Define y=i=1m1tixi where ti=ti1tm.

  8. Note that ti0. Also, i=1m1ti=1 since i=1m1ti=1tm.

  9. Thus, y is an m1 point convex combination of C.

  10. By induction hypothesis, yC.

  11. Now, (1tm)y=i=1m1tixi.

  12. Hence, x=(1tm)y+tmxm.

  13. It is a 2 point convex combination of y and xm.

  14. Since both y,xmC, hence xC.

  15. Thus, C contains all its m point convex combinations.

For the converse, note that if C contains all its convex combinations, then it contains, in particular, all its two point convex combinations. Hence, C is convex.

Theorem 9.17

A convex combination of convex combinations is a convex combination.

Proof. Let SV. Note that S is arbitrary (no convexity assumed).

  1. Consider n points yi, i=1,,n described as below.

  2. Let yi=j=1mjti,jxi,j be convex combinations of mj points:

    • xi,1,,xi,mjS.

    • ti,j0.

    • j=1mjti,j=1.

  3. Consider the convex combination y=i=1nriyi.

    • ri0.

    • ri=1.

  4. We need to show that y is a convex combination of points of S.

Towards this:

y=i=1nriyi=i=1nrij=1mjti,jxi,j=i=1nj=1mjriti,jxi,j.

Consider the terms:

si,j=riti,j.

Since ri0 and ti,j0, hence si,j0.

Now, consider their sum:

i=1nj=1mjsi,j=i=1nj=1mjriti,j=i=1nrij=1mjti,j=i=1nri=1

Thus, i,jsi,j=1.

Hence,

y=i,jsi,jxi,j

is a convex combination of points of S.

9.2.6. Convex Hull#

Definition 9.9 (Convex hull)

The convex hull of an arbitrary set SV denoted as conv(S), is the set of all convex combinations of points in S.

conv(S)={θ1x1++θkxk|xkS,θi0,i=1,,k,θ1++θk=1}.
../_images/pic_convex_hull_random_2d_points.png

Fig. 9.2 The convex hull of a set of random points on the 2D plane#

Property 9.1 (Convexity of convex hull)

The convex hull conv(S) of a set S is always convex.

Proof. Let x,yconv(S), t[0,1] and the point z=tx+(1t)y. We need to show that zconv(S).

  1. x,y are convex combinations of points of S.

  2. z is a convex combination of x and y.

  3. Hence, z is a convex combination of convex combinations of points in S.

  4. By Theorem 9.17, a convex combination of convex combinations is a convex combination.

  5. Thus, z is a convex combination of points of S.

  6. But conv(S) contains all convex combinations of points of S by definition.

  7. Hence, zconv(S).

  8. Thus, conv(S) is convex.

Property 9.2 (Affine hull of convex hull)

Let V be a finite dimensional real vector space. Let S be a nonempty subset of V. Let C=conv(S). Then

affS=affC.

In other words, the affine hull of a set and its convex hull are identical.

Proof. By using a translation argument if necessary, without loss of generality, we assume that 0S.

  1. Then both affS and affC are linear subspaces.

  2. Since SC, hence affSaffC.

  3. For the converse, assume that m=affC.

  4. Let x1,,bxmC be m linearly independent vectors spanning affC.

  5. Then for every xaffC, there exist scalars t1,,tm so that

    x=i=1mtixi.
  6. By definition of convex hull, each xiC is a convex combination of points in S.

  7. Hence every xaffC is a linear combination of points in S.

  8. Hence affCaffS.

Theorem 9.18

The convex hull of a set S is the smallest convex set containing it. In other words, let C be any convex set such that SC. Then conv(S)C.

Proof. Let C be a convex set such that SC.

  1. Let xconv(S).

  2. Then, x is a convex combination of points of S.

  3. C is convex and SC.

  4. Hence, C contains every convex combination of points of S.

  5. Thus, in particular xC.

  6. Since xconv(S) was arbitrary, hence conv(S)C.

We could have started as defining the convex hull of S being the smallest convex set containing S and arrived at the conclusion that conv(S) contains all convex combinations of S. Some authors prefer to define conv(S) as the smallest convex set containing S. Both definitions are equivalent.

9.2.6.1. Carathéodory Theorem#

Theorem 9.19 (Carathéodory theorem)

Let V be an n-dimensional real vector space. Let SV. Let xconv(S).

Then, there exists a set of n+1 points x0,,xnS such that

xconv({x0,,xn});

i.e., there exists a tΔn+1 such that

x=i=0ntixi.

Proof. We note that xconv(S).

  1. Thus, there exists a set of k+1 points in S and tΔk+1 such that

    x=i=0ktixi.
  2. We can assume ti>0 for all i=0,,k since otherwise, we can drop the vectors corresponding to the zero coefficients from the convex combination.

  3. If kn, there is nothing to prove.

  4. Hence, consider the case where k>n.

  5. We now describe a process which can reduce the number of points in the convex combination by one.

  6. The k vectors x1x0,,xkx0 are linearly dependent as k>n and V is n-dimensional.

  7. Thus, there is a nontrivial linear combination of these vectors

    r1(x1x0)++rk(xkx0)=0.
  8. Let r0=r1rk. Then, we have a nontrivial combination

    i=0krixi=0

    with ri=0.

  9. In particular, there exists at least one index j for which rj<0.

  10. Let α0.

  11. Then,

    x=i=0ktixi+αi=0krixi=i=0k(ti+αri)xi

    with (ti+αri)=ti+αri=1.

  12. Thus, it is a convex combination for x if ti+αri0 for every i=0,,k.

  13. Let

    α=mini|ri<0{tiri}.
  14. α is well defined since there is at least one rj<0. Let j be the index for which α was obtained.

  15. Then, tj+αrj=0.

  16. Also, we can see that ti+αri0 for all i=0,,k.

  17. Thus, we have found a convex combination for x where the coefficient for xj is 0.

  18. Thus, we have obtained a convex combination for x with k1 points.

  19. Repeating this process up to kn times, we can obtain a convex combination of x consisting of n+1 or less points.

9.2.7. Dimension#

Definition 9.10 (Dimension of a convex set)

The dimension of a convex set is defined to be the dimension of its affine hull.

If C is a convex set, then:

dimC=dimaffC.

Recall that the dimension of an affine set is equal to the dimension of the linear subspace associated with it (Definition 4.163).

  1. A circle will have a dimension of 2 even if it is in R3.

  2. A sphere will have a dimension of three.

9.2.8. Simplices#

Theorem 9.20 (Convex hull of a finite set of points)

Let S={x0,,xm} be a finite set of points of V. Then, conv(S) consists of all the points of the form

t0x0+tmxm,t00,,tm0,i=0mti=1.

In Rn, this is known as a polytope.

A Simplex is a convex hull of a finite set of affine independent points. The simplex provides a powerful coordinate system for the points within it in terms of barycentric coordinates.

Definition 9.11 (k-simplex)

Let k+1 points v0,,vkV be affine independent.

The simplex determined by them is given by

C=conv{v0,,vk}={t0v0++tkvk|t0,1Tt=1}

where t=[t1,,tk]T and 1 denotes a vector of appropriate size (k) with all entries one.

In other words, C is the convex hull of the set {v0,,vk}.

A simplex is a convex set since it is a convex hull of its vertices. k stands for the dimension of the simplex. Recall that the dimension of a convex set is the dimension of its affine hull.

Example 9.3 (Simplex examples)

In Rn:

  • A 0-simplex is a point.

  • A 1-simplex is a line segment (2 points).

  • A 2-simplex is a triangle (3 points).

  • A 3-simplex is a tetrahedron (4 points).

  • A 4-simplex is a 5-cell (5 points).

Theorem 9.21 (Barycentric coordinates)

Each point of a k simplex is uniquely expressible as a convex combination of the vertices.

Proof. Let C=conv{v0,v1,,vk}.

  1. Let vC.

  2. Then, v=i=0ktivi with ti0 and ti=1.

  3. For contradiction, assume there was another representation: v=i=0krivi with ri0 and ri=1.

  4. Then,

    i=0ktivi=i=0krivii=0k(tiri)vi=0.
  5. But {v0,v1,,vk} are affine independent.

  6. Hence, ti=ri.

  7. Thus, the representation is unique.

Definition 9.12 (Simplex midpoint)

The point i=0k1k+1vi in a simplex C=conv{v0,,vk} is known as its midpoint or barycenter.

Theorem 9.22

The dimension of a convex set C is the maximum of the dimensions of the various simplices included in C.

Proof. We need to show that there is a simplex SC such that dimS=dimC.

  1. Let A be any finite affine independent subset of C.

  2. Since C is convex, hence ACconv(A)C.

  3. Thus, C contains the simplices constructed from any set of finite affine independent points in C.

  4. Thus, if A={v0,,vk} is a set of k+1 affine independent points of C, then conv(A)C implies that kdimC.

  5. Thus, if S is a k-simplex such that SC, then dimS=kdimC.

  6. Let m be the maximum of the dimensions of the various simplices contained in C.

  7. Then, there exist affine independent points v0,,vmC such that the simplex S=conv{v0,,vm}C.

  8. Let M be the affine hull of S; i.e. M=affS.

  9. Then, dimM=m and MaffC.

  10. If CM were nonempty, then there would be an element vCM which would be affine independent of {v0,,vm}.

  11. That would lead to a set of m+2 affine independent points in C. That would mean that C contains a simplex of dimension m+1. A contradiction.

  12. Hence, CM=.

  13. Thus, CM.

  14. Since affC is the smallest affine set that contains C, hence affC=M.

  15. Thus, dimC=m.

9.2.9. Symmetric Reflections#

The symmetric reflection of a convex set is convex since convexity is preserved under scalar multiplication. See Theorem 9.30 below.

If a symmetric convex set contains a nonzero vector x, then it contains the entire line segment between x and x.

9.2.10. Infinite Convex Combinations#

We can generalize convex combinations to include infinite sums.

Theorem 9.23

Let θ1,θ2, satisfy

θi0,i=1,2,,i=1θi=1,

and let x1,x2,C, where CV is convex. Then

i=1θixiC,

if the series converges.

We can generalize it further to density functions.

Theorem 9.24

Let p:VR satisfy p(x)0 for all xC and

Cp(x)dx=1

Then

Cp(x)xdxC

provided the integral exists.

Note that p above can be treated as a probability density function if we define p(x)=0xVC.

9.2.11. Convexity Preserving Operations#

In the following, we will discuss several operations which transform a convex set into another convex set, and thus preserve convexity.

Understanding these operations is useful for determining the convexity of a wide variety of sets.

Usually, it is easier to prove that a set is convex by showing that it is obtained by a convexity preserving operation from a convex set compared to directly verifying the convexity property i.e.

tx1+(1t)x2Cx1,x2C,t[0,1].

9.2.11.1. Intersection and Union#

Theorem 9.25 (Intersection of convex sets)

If S1 and S2 are convex sets then S1S2 is convex.

Proof. Let x1,x2S1S2. We have to show that

tx1+(1t)x2S1S2,t[0,1].

Since S1 is convex and x1,x2S1, hence

tx1+(1t)x2S1,t[0,1].

Similarly

tx1+(1t)x2S2,t[0,1].

Thus

tx1+(1t)x2S1S2,t[0,1].

which completes the proof.

We can generalize it further.

Theorem 9.26 (Intersection of arbitrary collection of convex sets)

Let {Ai}iI be a family of sets such that Ai is convex for all iI. Then iIAi is convex.

Proof. Let x1,x2 be any two arbitrary elements in iIAi.

x1,x2iIAix1,x2AiiItx1+(1t)x2Ait[0,1]iI since Ai is convex tx1+(1t)x2iIAi.

Hence iIAi is convex.

Corollary 9.1 (Arbitrary intersection of closed half spaces)

Let I be an index set. Let aiV and biR for every iI. Then, the set:

C={xV|x,aibiiI}

is convex.

Proof. Since each of the half spaces is convex, hence so is their intersection.

This result is applicable for open half spaces and hyperplanes too too. It also applies for a mixture of hyperplanes and half-spaces.

Corollary 9.2

The solution set of a system of linear equations and inequalities in Rn is convex.

Proof. We proceed as follows:

  • The solution set of each linear equation is a hyperplane.

  • The solution set of each linear inequality is a half-space (closed or open).

  • The solution set of a system of linear equations and inequalities is the intersection of these hyperplanes and half-spaces.

  • Each hyperplane and each half-space is convex.

  • Hence, their intersection is convex.

Theorem 9.27 (Intersection and union of two sets)

Let C1 and C2 be convex in V. Then, the largest convex set contained in both is C1C2. And, the smallest convex set containing both is conv(C1C2).

Proof. Let C be a convex set contained in both C1 and C2. Then, CC1C2. But C1C2 is convex (Theorem 9.25). Hence, C1C2 is the largest convex set contained in both C1 and C2.

Let C be a convex set which contains both C1 and C2. Then, C1C2C. The smallest convex set containing C1C2 is its convex hull given by conv(C1C2) (Theorem 9.18).

Theorem 9.28 (Intersection and union of arbitrary sets)

Let I be an index set and F={Ci}iI be a family of convex sets in V. Then, the largest convex set contained in every set of F is:

iICi.

And, the smallest convex set containing every set of F is

conv(iICi).

Proof. Let C be a convex set contained in every set of F. Then, CiICi. But iICi is convex (Theorem 9.26). Hence, iICi is the largest convex set contained in every set of F.

Let C be a convex set which contains every set of F. Then, iICiC. The smallest convex set containing every set of F is its convex hull given by conv(iICi) (Theorem 9.18).

9.2.11.2. Affine Functions#

Let us start with some simple results.

Theorem 9.29 (Convexity and translation)

Convexity is preserved under translation.

C (a subset of V) is convex if and only if C+a is convex for every aV.

Proof. Let CV.

  1. Assume C is convex.

  2. Then, for every x,yC and every t[0,1], tx+(1t)yC.

  3. Let aV.

  4. Let u,vC+a.

  5. Then, u=x+a and v=y+a for some x,yC.

  6. Then,

    tu+(1t)v=t(x+a)+(1t)(y+a)=tx+(1t)y+a.
  7. But tx+(1t)yC since C is convex.

  8. Then, tx+(1t)y+aC+a.

  9. Thus, tu+(1t)vC+a.

  10. Thus, C+a is convex.

We can follow the same argument in the opposite direction to establish that C+a is convex implies C is convex.

Theorem 9.30 (Convexity and scalar multiplication)

Convexity is preserved under scalar multiplication.

C (a subset of V) is convex if and only if αC is convex for every αR.

Proof. Let CV.

  1. Assume C is convex.

  2. Let αR.

  3. Let u,yαC.

  4. Then, u=αx and v=αy for some x,yC.

  5. Let t[0,1].

  6. tu+(1t)v=α(tx+(1t)y).

  7. But tx+(1t)yC since C is convex.

  8. Hence, α(tx+(1t)y) in αC.

  9. Hence, tu+(1t)vαC.

  10. Thus, αC is convex.

Similar argument in opposite direction establishes that αC is convex implies C is convex.

Recall that an affine function f:VE from a real vector space V to another real vector space E is a function which satisfies

f(tx+(1t)y)=tf(x)+(1t)f(y)

for every tR.

Recall from Theorem 4.193 that an affine function can be written as a linear transformation followed by a translation:

f(x)=T(x)+b

where T is a linear operator.

Example 9.4

An affine function f:RnRm takes the form of a matrix multiplication plus a vector addition:

f(x)=Ax+b

where ARm×n and bRm.

Theorem 9.31 (Image under affine function)

Let SV be convex and f:VE be an affine function. Then the image of S under f given by

f(S)={f(x)|xS}

is a convex set.

Proof. We proceed as follows:

  1. Let u,vf(S).

  2. Then, u=f(x) and v=f(y) for some x,yS.

  3. Let 0t1.

  4. Then, z=tx+(1t)yS since S is convex.

  5. Since f is affine, hence

    f(z)=f(tx+(1t)y)=tf(x)+(1t)f(y)=tu+(1t)v.
  6. Since zS, hence f(z)=tu+(1t)vf(S).

  7. We have shown that for any u,vf(S) and any 0t1, tu+(1t)vf(S).

  8. Thus, f(S) is convex.

It applies in the reverse direction also.

Theorem 9.32 (Inverse image under affine function)

Let f:VE be affine and SE be convex. Then the inverse image of S under f given by

f1(S)={xV|f(x)S}

is convex.

Proof. Denote R=f1(S). We need to show that if S is convex then R is convex too.

We proceed as follows:

  1. Let x,yR.

  2. Let u=f(x) and v=f(y).

  3. u,vS.

  4. Let 0t1.

  5. Then, w=tu+(1t)vS since S is convex.

  6. Let z=tx+(1t)y.

  7. Since f is affine, hence

    w=tu+(1t)v=tf(x)+(1t)f(y)=f(tx+(1t)y)=f(z).
  8. Since wS, hence zR as w=f(z).

  9. We have shown that for any x,yR and any 0t1, tx+(1t)yR.

  10. Thus, R is convex.

Example 9.5 (Affine functions preserving convexity)

Let SRn be convex.

  • For some αR, αS is convex. This is the scaling operation.

  • For some aRn, S+a is convex. This is the translation operation.

  • Let n=m+k. Then, let Rn=Rm×Rk. A vector xS can be written as x=(x1,x2) where x1Rm and x2Rk. Then

    T={x1Rm|(x1,x2)S for some x2Rk}

    is convex. This is the projection operation. It projects vectors from Rn to Rm by dropping last k entries.

Example 9.6 (System of linear equations)

Consider the system of linear equations Ax=y where ARm×n.

If y ranges over a convex set, then the corresponding set of solutions also ranges over a convex set due to Theorem 9.32.

The nonnegative orthant R+n is a convex set.

Let Y=R+m+a for some aRm; i.e.,

Y={y|ya}.

Then, A1Y is the set of vectors satisfying the inequality

Axa.

Thus, the solution set of a system of linear inequalities of the form Axa is convex.

Now, if X=R+n, then AX is the set of vectors yRm such that the equation Ax=y has a nonnegative solution (x0). Since X is convex, so is AX.

Theorem 9.33 (Orthogonal projection of convex set)

The orthogonal projection of a convex set C on a subspace V is another convex set.

Proof. We recall that orthogonal projection is a linear mapping and thus an affine function. By Theorem 9.31, image of a convex set under an affine function is convex. Hence proved.

9.2.11.3. Set Addition#

Theorem 9.34 (Convexity and set addition)

Let C1 and C2 be two convex subsets of V. Then C1+C2 is convex.

Proof. We proceed as follows:

  1. Let x,yC1+C2.

  2. Then, x=x1+x2 for some x1C1 and some x2C2.

  3. Similarly, y=y1+y2 for some y1C1 and some y2C2.

  4. Let 0t1.

  5. Then:

    tx+(1t)y=t(x1+x2)+(1t)(y1+y2)=tx1+(1t)y1+tx2+(1t)y2.
  6. But, z1=tx1+(1t)y1C1 since C1 is convex.

  7. Similarly, z2=tx2+(1t)y2C2 since C2 is convex.

  8. Hence, tx+(1t)y=z1+z2C1+C2.

  9. Thus, C1+C2 is convex.

One way to think geometrically about set addition is as the union of all translates of C1 given by C1+x as x varies over C2.

Theorem 9.35

A set C is convex if and only if

(1t)C+tC=Ct[0,1].

Proof. Assume C is convex:

  1. (1t)C+tC={tx+(1t)y|x,yC}.

  2. Thus, (1t)C+tCC.

  3. For every xC, (1t)x(1t)C and txtC.

  4. Thus, (1t)x+tx=x(1t)C+tC.

  5. Thus, C(1t)C+tC.

  6. Combining, we get (1t)C+tC=C.

Assume (1t)C+tC=C for every t[0,1].

  1. Let x,yC and t[0,1].

  2. Then, (1t)x(1t)C and tytC.

  3. Hence, (1t)x+ty(1t)C+tC=C.

  4. Thus, C is convex.

Theorem 9.36 (Convexity and linear combination)

Convexity is preserved under linear combinations.

Let C1,,Ck be convex. Let t1,,tkR. Then, their linear combination:

C=t1C1++tkCk

is convex.

Proof. Due to Theorem 9.30, tiCi are convex for i=1,,k.

By (finite) repeated application of Theorem 9.34, their sum is also convex.

Theorem 9.37 (Nonnegative scalar multiplication distributive law)

Let C be convex and t1,t20. Then

(t1+t2)C=t1C+t2C.

Proof. From Theorem 4.16, we know that:

(t1+t2)Ct1C+t2C.

We now show that t1C+t2C(t1+t2)C.

  1. If both t1=t2=0, then we have trivial equality.

  2. If either of t1 or t2 is 0, then also we have trivial equality.

  3. Now, consider the case t1,t2>0.

  4. Define t=t1+t2>0 and r=t1t.

  5. Then, 1r=t2t.

  6. Then, since C is convex, hence rC+(1r)CC.

  7. Multiplying by t on both sides, we get: t1C+t2C(t1+t2)C.

For the special case of t1=r and t2=1r with r[0,1], we get:

C=rC+(1r)C.

Some implications are C+C=2C, C+C+C=3C and so forth if C is convex.

Theorem 9.38 (Convex combinations over arbitrary unions)

Let I be an index set and F={Ci}iI be a family of convex sets in V. Let C be given as:

C=conv(iICi);

i.e., C is the convex hull of the union of the family of sets F. Then,

C={iItiCi}

where the union is taken over all finite convex combinations (i.e. over all nonnegative choices of ti such that only finitely many ti are nonzero and they add up to 1).

Proof. We proceed as follows:

  1. Let xC.

  2. Then, x is a convex combination of elements in iICi.

  3. Thus, x=i=1maiyi where yiiICi, ai0 and ai=1.

  4. Drop all the terms from x where ai=0.

  5. If yi,yj belong to some same Ck, then, we can replace aiyi+ajyj with some ay=aiyi+ajyj where a=ai+aj. Note that, with these assumptions, y=aiai+ajyi+ajai+ajyj is a convex combination of yi and yj, hence yCk since Ck is convex.

  6. Thus, terms from a single Ck in x can be reduced by a single term.

  7. Thus, we can simplify x such that

    x=j=1pbjxj

    such that each xj belongs to a different Cij with ijI, bj>0 and bj=1.

  8. Thus, C is a union of finite convex combinations of the form

    b1Ci1++bpCip

    where i1,,ipI are different indices and bj>0, bj=1.

  9. This is the same set as {iItiCi} except for notational differences.

Corollary 9.3 (Convex hull of union)

Let C1 and C2 be convex sets. Let C be the convex hull of their union:

C=conv(C1C2).

Then,

C=t[0,1][(1t)C1+tC2].

9.2.11.4. Partial Addition#

Recall the notion of direct sum of two vector spaces V and W over R given by VW.

Theorem 9.39 (Partial addition on convex sets)

Let V and W be real vector spaces. Let VW be their direct sum. Let C1 and C2 be convex sets in VW. Let C be the set of vectors x=(y,z) such that there exist vectors z1,z2W with (y,z1)C1, (y,z2)C2 and z1+z2=z. Then, C is a convex set in V×W.

Proof. Let (y,z)C such that there exist vectors z1,z2W with (y,z1)C1, (y,z2)C2 and z1+z2=z.

Let (y,z)C such that there exist vectors z1,z2W with (y,z1)C1, (y,z2)C2 and z1+z2=z.

Let t[0,1]. Consider the vector (y,z)=t(y,z)+(1t)(y,z).

  1. y=ty+(1t)y.

  2. z=tz+(1t)z=t(z1+z2)+(1t)(z1+z2).

  3. Let z1=tz1+(1t)z1.

  4. Let z2=tz2+(1t)z2.

  5. Since (y,z1),(y,z1)C1 and C1 is convex, hence (y,z1)C1.

  6. Since (y,z2),(y,z2)C2 and C2 is convex, hence (y,z2)C2.

  7. But then, we note that z=z1+z2.

  8. Thus, (y,z1)C1 and (y,z2)C2 implies that (y,z)C.

  9. Thus, C is convex.

We can write a version of the theorem above for Rn.

Corollary 9.4 (Partial addition on convex sets in Euclidean space)

Let C1 and C2 be convex sets in Rn+m. Let C be the set of vectors x=(y,z) such that there exist vectors z1,z2Rm with (y,z1)C1, (y,z2)C2 and z1+z2=z. Then, C is a convex set in Rn+m.

The relationship between C and C1,C2 is known as partial addition.

  • When V={0} we are left with the result that C1,C2 convex implies C1+C2 is convex.

  • When W={0}, we are left with the result that C1,C2 convex implies C1C2 is convex.

  • In between, we have a spectrum of results where for a vector in C, part of the representation must be common between C1 and C2 while the remaining part must be the sum of corresponding parts of vectors in C1 and C2.

  • In other words, if a vector space can be decomposed as a direct sum of two subspaces, then we have intersection or representation in one subspace while addition in the other.

  • This partial addition (binary) operation is commutative as well as associative.

Partial additions appear naturally in convex cones in VR generated by a convex set in V. See Observation 9.3 and discussion thereafter.

9.2.11.5. Cartesian Product/Direct Sum#

Theorem 9.40 (Direct sum of convex sets)

Let V and W be real vector spaces. Let CV and DW be convex subsets of V and W respectively. Then, CD is a convex subset of VW.

More generally, if V1,,Vk are real vector spaces and CiVi are convex subsets for i=1,,k, then C=C1Ck is convex in the direct sum of vector spaces V1Vk.

Proof. If either C or D is empty, then CD is empty, hence convex. We shall thus assume that both C and D are nonempty.

  1. Let z1,z2CD and t(0,1).

  2. Then, z1=(x1,y1) and z2=(x2,y2) such that x1,x2C and y1,y2D.

  3. Since C and D are convex, hence x=tx1+(1t)x2C and y=ty1+(1t)y2D.

  4. Now,

    z=tz1+(1t)z2=t(x1,y1)+(1t)(x2,y2)=(tx1+(1t)x2,ty1+(1t)y2)=(x,y).
  5. Since xC and yD, hence z=(x,y)CD.

  6. Thus, CD is closed under convex combination.

  7. Thus, CD is convex.

The generalization for multiple real vector spaces is easily verifiable through induction.

9.2.11.6. Projection#

Theorem 9.41 (Projection of a direct sum)

Let V and W be real vector spaces. Let CV and DW. Assume that CD is a convex subset of VW. Then, C and D are convex subsets of V and W respectively.

More generally, if V1,,Vk are real vector spaces and CiVi are subsets for i=1,,k, such that C=C1Ck is convex in the direct sum of vector spaces V1Vk; then Ci are convex subsets of Vi for i=1,,k.

Proof. Consider the case of two vector spaces V and W.

  1. Let x1,x2C and t(0,1).

  2. Pick any yD.

  3. Then, (x1,y),(x2,y)CD.

  4. Since CD is convex, hence

    t(x1,y)+(1t)(x2,y)=(tx1+(1t)x2,y)CD.
  5. Thus, tx1+(1t)x2C.

  6. Thus, C is convex.

  7. Similarly D is also convex.

The argument can be extended by mathematical induction for multiple vector spaces.

Theorem 9.42 (Projection of a convex set)

Let V and W be real vector spaces. Let CVW be a convex set of VW.

For every xV, define

Dx={yW|(x,y)C}.

Then Dx is convex for every xV.

Similarly, if for every yW, we define

Ey={xV|(x,y)C}

then Ey is convex for every yW.

Proof. If Dx is empty then it is convex vacuously. Hence assume that Dx is nonempty.

  1. Then for every yDx there exists (x,y)C.

  2. Let u,vDx and t[0,1].

  3. Then (x,u)C and (x,v)C.

  4. Since C is convex hence t(x,u)+(1t)(x,v)C.

  5. i.e., (x,tu+(1t)v)C.

  6. This implies that tu+(1t)vDx.

  7. Hence Dx is convex.

The argument for the convexity of Ey is identical.

9.2.12. Extreme Points#

Definition 9.13 (Extreme points of convex sets)

Let VV be a real vector space and let C be a subset of V.

A point xS is called an extreme point of S if there do not exist x1,x2S with x1x2 and t(0,1) such that

x=tx1+(1t)x2.

In other words, x cannot be expressed as a nontrivial convex combination of two different points in S.

The set of extreme points of a set S is denoted by extS.

Example 9.7 (Extreme points)

  1. Let C=[0,1]R. Then, 0 and 1 are extreme points of C.

  2. Let C=(0,1)R. C doesn’t have any extreme point.

  3. In a triangle, the three vertices are extreme points.

  4. In a convex polytope, all the vertices are extreme points.

A more intricate example of the set of extreme points for the set P={xRn|Ax=b,x0} is discussed in Theorem 9.59.