9.8. Convex Functions#

Throughout this section, we assume that V,W are real vector spaces. Wherever necessary, they are equipped with a norm or an real inner product ,. They are also equipped with a metric d(x,y)=xy as needed.

We suggest the readers to review the notions of graph, epigraph, sublevel sets of real valued functions in Real Valued Functions. Also pay attention to the notion of extended real valued functions, their effective domains, graphs and level sets.

9.8.1. Convexity of a Function#

Definition 9.43 (Convex function)

Let V be a real vector space. A real valued function f:VR is convex if domf is a convex set and for all x1,x2domf, and t[0,1], we have:

(9.1)#f(tx1+(1t)x2)tf(x1)+(1t)f(x2).

An extended valued function f:VR is convex if domf is a convex set and for every x1,x2V, and t[0,1], we have:

f(tx1+(1t)x2)tf(x1)+(1t)f(x2).
../_images/convex_function.png

Fig. 9.3 Graph of a convex function. The line segment between any two points on the graph lies above the graph.#

For a convex function, every chord lies above the graph of the function.

9.8.1.1. Strictly Convex Functions#

Definition 9.44 (Strictly convex function)

Let V be a real vector space. A convex function f:VR is strictly convex if for all x1,x2domf, where x1 and x2 are distinct, and t(0,1), we have:

f(tx1+(1t)x2)<tf(x1)+(1t)f(x2).

In other words, the inequality is a strict inequality whenever the point x=tx1+(1t)x2 is distinct from x1 and x2 both.

9.8.1.2. Concave Functions#

Definition 9.45 (Concave function)

We say that a function f is concave if f is convex. A function f is strictly concave if f is strictly convex.

Example 9.25 (Linear functional)

A linear functional on V in an inner product space has the form of an inner product parameterized by a vector aV as

fa(x)x,axV.

i.e. the inner product of x with a.

Theorem 9.67

All linear functionals on a real vector space are convex as well as concave.

Proof. For any x,yV and t[0,1]:

fa(tx+(1t)y)=tx+(1t)y,a=tx,a+(1t)x,ainner products are linear=tfa(x)+(1t)fa(y).

Thus, fa is convex. We can also see that fa is convex too. Hence, fa is concave too.

9.8.1.3. Arithmetic Mean#

Example 9.26 (Arithmetic mean is convex and concave)

Let f:RnR be given by:

f(x)=x1++xnn.

Arithmetic mean is a linear function. In fact

f(αx+βy)=1n(i=1n(αxi+βyi))=αx1++xnn+βy1++ynn=αf(x)+βf(y)

for all α,βR.

Thus, for t[0,1]

f(tx+(1t)y)=tf(x)+(1t)f(y).

Thus, arithmetic mean is both convex and concave.

9.8.1.4. Affine Function#

Example 9.27 (Affine functional)

An affine functional is a special type of affine function which maps a vector from V to a scalar in the associate field F.

On real vector spaces, an affine functional on V is defined in terms of a vector aV and a scalar bR as

fa,b(x)x,a+bxV

i.e. the inner product of x with a followed by a translation by the amount of b.

Theorem 9.68

All affine functionals on a real vector space are convex as well as concave.

Proof. A simple way to show this is to recall that for any affine function T,

T(tx+(1t)y)=tT(x)+(1t)T(y)

holds true for any x,yV and tF.

In the particular case of real affine functionals, for any x,yV and t[0,1], the previous identity reduces to:

fa,b(tx+(1t)y)=tfa,b(x)+(1t)fa,b(y).

This establishes that fa,b is both convex and concave.

Another way to prove this is by following the definition of fa,b above:

fa,b(tx+(1t)y)=tx+(1t)y,a+b=tx,a+(1t)x,a+b=tx,a+(1t)x,a+tb+(1t)b=(tx,a++tb)+((1t)x,a+(1t)b)=t(x,a+b)+(1t)(x,a+b)=tfa,b(x)+(1t)fa,b(y).

9.8.1.5. Absolute Value#

Example 9.28 (Absolute value is convex)

Let f:RR be:

f(x)=|x|.

with domf=R.

../_images/func_abs_value_1d.png

Recall that |x| is a norm on the real line R. Thus, it satisfies the triangle inequality:

|x+y||x|+|y|x,yR.

In particular, for any t[0,1]

|tx+(1t)y||tx|+|(1t)y|=|t||x|+|1t||y|=t|x|+(1t)|y|.

Thus f satisfies the convexity defining inequality (9.1):

f(tx+(1t)y)tf(x)+(1t)f(y)x,yR.

Hence, f is convex.

9.8.1.6. Norms#

Theorem 9.69 (All norms are convex)

Let :VR be a norm on a real vector space V. Then, it satisfies the triangle inequality:

x+yx+yx,yV.

In particular, for any t[0,1]

tx+(1t)ytx+(1t)y=|t|x+|1t|y=tx+(1t)y.

Thus f satisfies the convexity defining inequality (9.1). Hence, f is convex.

../_images/func_l2_norm_r2_contour3d.png

Fig. 9.4 2 norm for R2. 2:R2R 3D contour plots.#

../_images/func_l2_norm_r2_contour2d.png

Fig. 9.5 2 norm for R2. 2:R2R 2D contour plots.#

../_images/func_l1_norm_r2_contour3d.png

Fig. 9.6 1 norm for R2. 1:R2R 3D contour plots.#

../_images/func_l1_norm_r2_contour2d.png

Fig. 9.7 1 norm for R2. 2:R1R 2D contour plots.#

9.8.1.7. Max Function#

Example 9.29 (Max function is convex)

Let f:RnR be:

f(x)=max{x1,,xn}.

with domf=Rn.

Let x,yRn and t[0,1].

  1. Let x=f(x)=max{x1,,xn} and y=f(y)=max{y1,,yn}.

  2. Then, for every k=1,,n, we have: xkx and yky.

  3. Thus, txk+(1t)yktx+(1t)y.

  4. Taking maximum on the left hand side, we obtain:

    maxk=1n(txk+(1t)yk)tx+(1t)y.
  5. But f(tx+(1t)y)=maxk=1n(txk+(1t)yk).

  6. Thus, f satisfies the convexity defining inequality (9.1):

    f(tx+(1t)y)tf(x)+(1t)f(y).
  7. Thus, f is convex.

9.8.1.8. Geometric Mean#

Example 9.30 (Geometric mean is concave)

Let f:RnR be given by:

f(x)=(i=1nxi)1n

with domf=R++n.

Recall from AM-GM inequality:

(i=1nai)1n1n(i=1nai).
  1. Let x,yR++n and t[0,1].

  2. Let z=tx+(1t)y.

  3. Note that zR++n too, hence every component of z is positive.

  4. With ai=xizi, and applying AM-GM inequality:

    f(x)f(z)=(i=1nxizi)1n1n(i=1nxizi).
  5. With ai=yizi, and applying AM-GM inequality:

    f(y)f(z)=(i=1nyizi)1n1n(i=1nyizi).
  6. Multiplying the first inequality by t and the second by (1t) and adding them, we get:

    tf(x)+(1t)f(y)f(z)1n(i=1ntxizi+(1t)yizi).
  7. Recall that zi=txi+(1t)yi. Thus, the inequality above simplifies to:

    tf(x)+(1t)f(y)f(z)1tf(x)+(1t)f(y)f(z)=f(tx+(1t)y).
  8. Thus, f is concave.

9.8.1.9. Powers#

Example 9.31 (Powers of absolute value)

Let f:RR be:

f(x)=|x|p

with domf=R.

For p=1, the absolute value function f(x)=|x| is convex.

Consider the case where p>1. Let q be its conjugate exponent given by 1p+1q=1.

Let x,yR and s,t(0,1) with s+t=1.

  1. By triangle inequality:

    |sx+ty|s|x|+t|y|.
  2. We can write this as:

    s|x|+t|y|=(s1p|x|)s1q+(t1p|y|)t1q.
  3. By Hölder's inequality, for some a1,a2,b1,b20

    a1b1+a2b2(a1p+a2p)1p+(b1q+b2q)1q.
  4. Let a1=s1p|x|, a2=t1p|y|, b1=s1q and b2=t1q.

  5. Applying Hölder’s inequality,

    |sx+ty|s|x|+t|y|(s|x|p+t|y|p)1p(s+t)1q=(s|x|p+t|y|p)1p.
  6. Taking power of p on both sides, we get:

    |sx+ty|ps|x|p+t|y|p.
  7. Which is the same as

    f(sx+ty)sf(x)+tf(y).
  8. Thus, f is convex.

9.8.1.10. Empty Function#

Observation 9.4 (Empty function is convex)

Let f:VR be such that domf=. Then f is convex.

Proof. The empty set domf is vacuously convex. The defining inequality (9.1) is vacuously true for f since its domain is empty.

This observation may sound uninteresting however it is important in the algebra of convex functions. Definition 2.48 provides definitions for sum of two partial functions and scalar multiplication with a partial function. If two convex functions f and g are such that domfdomg=, then their sum f+g is empty function. The observation above says that f+g is still a convex function.

Jensen's inequality, discussed later in the section, generalizes the notion of convexity of a function to arbitrary convex combinations of points in its domain.

9.8.2. Convexity on Lines in Domain#

Theorem 9.70 (f is convex = f is convex on lines in domain)

Let V be a real vector space. A function f:VR is convex if and only if for any xdomf and any vV, the function g(t)=f(x+tv) is convex (on its domain , {tR|x+tvdomf}).

In other words, f is convex if and only if it is convex when restricted to any line that intersects its domain.

Proof. For any arbitrary f, and for any xdomf,vV
we have defined g:RR as:

g(t)f(x+tv)

with the domain:

domg={t|x+tvdomf}.

Assume f to be convex. To show that g is convex, we need to show that domg is convex and g satisfies the convexity inequality.

  1. Let t1,t2domg.

  2. It means that there are y=x+t1vdomf and z=x+t2vdomf,

  3. and g(t1)=f(y) and g(t2)=f(z).

  4. Let r[0,1] and t=rt1+(1r)t2.

  5. Note that:

    x+tv=x+(rt1+(1r)t2)v=rx+(1r)x+rt1v+(1r)t2v=r(x+t1v)+(1r)(x+t2v)=ry+(1r)z.
  6. Since domf is convex, hence ry+(1r)zdomf.

  7. Thus, g is defined at t=rt1+(1r)t2 for all r[0,1].

  8. We have shown that if t1,t2domg, then t=rt1+(1r)t2domg for all r[0,1].

  9. Thus, domg is convex.

  10. Now,

    g(t)=g(rt1+(1r)t2)=f(ry+(1r)z)rf(y)+(1r)f(z)=rg(t1)+(1r)g(t2).
  11. We showed that for any t1,t2domg and r[0,1],

    g(rt1+(1r)t2)rg(t1)+(1r)g(t2).
  12. Thus, g is convex.

For the converse, we need to show that if for any xdomf and any vV, g is convex, then f is convex.

Assume for contradiction that f is not convex. Then, either domf is not convex or there exists some x,ydomf and t[0,1] such that:

f((1t)x+ty)>(1t)f(x)+tf(y).

We show that both of these conditions lead to contradictions.

  1. Assume domf is not convex.

  2. Then, there exists x,ydomf such that for some t[0,1]:

    (1t)x+tydomf.
  3. Let v=yx.

  4. Note that (1t)x+ty=x+tv.

  5. Picking g for this particular choice of x and v, we note that g is defined at t=0 and t=1 since they corresponding to the points x and y respectively in domf.

  6. Since g is convex by hypothesis, hence, g is defined at every t[0,1].

  7. This contradicts the fact that g is not defined at some t[0,1].

  8. Thus, domf must be convex.

We now accept that domf is convex, but there exists some x,ydomf and t[0,1] such that:

f((1t)x+ty)>(1t)f(x)+tf(y).
  1. Again, let v=yx and note that (1t)x+ty=x+tv.

  2. Pick g for this choice of x and v.

  3. We have g(t)=f(x+tv).

  4. In particular, g(0)=f(x) and g(1)=f(y).

  5. Since g is convex, hence for every t[0,1],

    f((1t)x+ty)=f(x+tv)=g(t)=g((1t)0+t1)(1t)g(0)+tg(1)=(1t)f(x)+tf(y).
  6. We have a contradiction.

  7. Thus, f must be convex.

A good application of this result is in showing the concavity of the log determinant function in Example 9.49 below.

9.8.3. Epigraph#

The epigraph of a function f:VR is given by:

epif={(x,t)VR|xdomf,f(x)t}.

VR is the direct sum of V and R having appropriate vector space structure.

../_images/pic_epigraph_x_sqr.png

Fig. 9.8 Epigraph of the function f(x)=x2.#

The definition of epigraph also applies for extended real valued functions f:VR.

9.8.3.1. Convex Functions#

Theorem 9.71 (Function convexity = Epigraph convexity)

Let V be a real vector space. A function f:VR is convex if and only if its epigraph epif is a convex set.

This statement is also valid for extended real valued functions.

Proof. Let C=epif.

Assume f is convex.

  1. Let (x1,t1),(x2,t2)epif.

  2. Then, f(x1)t1 and f(x2)t2.

  3. Let r[0,1].

  4. Consider the point (x,t)=r(x1,t1)+(1r)(x2,t2).

  5. We have x=rx1+(1r)(x2).

  6. And t=rt1+(1r)t2.

  7. Then, rf(x1)+(1r)f(x2)rt1+(1r)t2=t.

  8. Since f is convex, hence f(x)=f(rx1+(1r)(x2))rf(x1)+(1r)f(x2).

  9. But then, f(x)t.

  10. Thus, (x,t)epif.

  11. Thus, (x1,t1),(x2,t2)epif implies r(x1,t1)+(1r)(x2,t2)epif for all r[0,1].

  12. Thus, epif is convex.

Assume epif is convex.

  1. Let x1,x2domf.

  2. Let t1=f(x1) and t2=f(x2).

  3. Then, (x1,t1),(x2,t2)epif.

  4. Let r[0,1].

  5. Let x=rx1+(1r)x2.

  6. Since epif is convex, hence r(x1,t1)+(1r)(x2,t2)epif.

  7. i.e., (rx1+(1r)x2,rt1+(1r)t2)epif.

  8. Thus, x=rx1+(1r)x2domf.

  9. Thus, domf is convex.

  10. And, f(x)rt1+(1r)t2.

  11. But rt1+(1r)t2=rf(x1)+(1r)f(x2).

  12. Thus, f(rx1+(1r)x2)rf(x1)+(1r)f(x2).

  13. Thus, f is convex as x1,x2domf and r[0,1] were arbitrary.

Note that domf is the projection of epif from VR to V. Due to Theorem 9.31, epif convex implies domf convex as the projection is a linear operation.

Theorem 9.72 (Convex function from convex set by minimization)

Let C be a nonempty convex set in VR. Let f:VR be the function defined by

f(x)=inf{wR|(x,w)C}xV.

Then f is convex.

Proof. We show the convexity of f by showing the convexity of its epigraph.

  1. Let (x,s) and (y,t) be two points in epif.

  2. Then f(x)s and f(y)t.

  3. By the definition of f (infimum rule), for every kN, there exists (x,sk)C such that sks+1k.

  4. Similarly, for every kN, there exists (y,tk)C such that tkt+1k.

  5. Consider the sequences {(x,sk)} and {(y,tk)}.

  6. By the convexity of C, for every r[0,1] and every k

    (rx+(1r)y,rsk+(1r)tk)C.
  7. Hence for every k

    f(rx+(1r)y)rsk+(1r)tk.
  8. Taking the limit k, we have

    f(rx+(1r)y)rs+(1r)t.
  9. Hence (rx+(1r)y,rs+(1r)t)epif for every r[0,1].

  10. Hence epif is convex.

  11. Hence f is convex.

9.8.3.2. Nonnegative Homogeneous Functions#

Recall that a real valued function f:VR is nonnegative homogeneous if for every tR+, f(tx)=tf(x).

Theorem 9.73 (Nonnegative homogeneity = Epigraph is cone)

A function f:VR is nonnegative homogeneous if and only if its epigraph epif is a cone in VR.

Proof. Let f be nonnegative homogeneous.

  1. Let (x,f(x))epif.

  2. Let t0.

  3. Then, t(x,f(x))=(tx,tf(x))=(tx,f(tx)).

  4. But (tx,f(tx))epif since f is nonnegative homogeneous.

  5. Thus, VR is closed under nonnegative scalar multiplication.

  6. Thus, epif is a cone.

Assume epif is a cone.

  1. Let xdomf.

  2. Then, (x,f(x))epif.

  3. Since (0,0)epif as epif is a cone, hence f(0x)=0f(x)=0.

  4. Now, let t>0.

  5. Since epif is a cone then, t(x,f(x))epif.

  6. Then, t(x,f(x))=(tx,tf(x))epif.

  7. By definition of epigraph, f(tx)tf(x).

  8. We claim that f(tx)=tf(x) must hold true.

  9. Assume for contradiction that f(tx)=sf(x) where s<t.

  10. Then, (tx,sf(x))epif.

  11. Since epif is a cone, hence dividing by t, we get, (x,stf(x))epif.

  12. But then, f(x)stf(x)<f(x) which is a contradiction.

  13. Hence, f(tx)=tf(x) must hold true.

9.8.3.3. Nonnegative Homogeneous Convex Functions#

Theorem 9.74 (Nonnegative homogeneous convex function epigraph characterization)

Let V be a real vector space. A real valued function f:VR is nonnegative homogeneous and convex if and only if its epigraph epif is a convex cone.

Proof. By Theorem 9.73, f is nonnegative homogeneous if and only if epif is a cone.

By Theorem 9.71, f is convex if and only if epif is convex.

Thus, f is nonnegative homogeneous and convex if and only if its epigraph epif is a convex cone.

Theorem 9.75 (Nonnegative homogeneous convex function is subadditive)

Let V be a real vector space. A nonnegative homogeneous function f:VR is convex if and only if it is subadditive:

f(x+y)f(x)+f(y)x,ydomf.

Proof. Assume that f is nonnegative homogeneous and convex.

  1. By Theorem 9.74, epif is a convex cone.

  2. Then, by Theorem 9.47 epif is closed under addition and nonnegative scalar multiplication.

  3. Pick any x,ydomf.

  4. Then (x,f(x)),(y,f(y))epif.

  5. Then, their sum (x+y,f(x)+f(y))epif.

  6. This means that f(x+y)f(x)+f(y) by definition of epigraph.

Now for the converse, assume that f is nonnegative homogeneous and subadditive.

  1. By Theorem 9.73, epif is a cone.

  2. Consequently, it is closed under nonnegative scalar multiplication.

  3. Pick any x,ydomf.

  4. Let (x,α),(y,β)epif.

  5. Then, f(x)α and f(y)β.

  6. Now, (x+y,f(x+y))epif.

  7. Since f is subadditive, hence f(x+y)f(x)+f(y)α+β.

  8. Thus, (x+y,α+β)epif.

  9. We have shown that for any (x,α),(y,β)epif, (x+y,α+β)epif.

  10. Thus, epif is closed under vector addition.

  11. Since epif is closed under vector addition and nonnegative scalar multiplication, hence epif is a convex cone.

  12. But then, by Theorem 9.74, f is convex.

Corollary 9.5

Let V be a real vector space. Let f:VR be a nonnegative homogeneous convex function. Then,

f(x)f(x)xdomf.

Proof. Since f is nonnegative homogeneous convex, hence f is subadditive.

Thus,

f(xx)f(x)+f(x).

But,

f(xx)=f(0)=f(00)=0f(0)=0.

Thus,

f(x)+f(x)0.

Thus,

f(x)f(x).

Theorem 9.76 (Linearity of nonnegative homogeneous functions)

Let V be a real vector space. Let f:VR be a nonnegative homogeneous convex function. Then, f is linear on a subspace L of V if and only if f(x)=f(x) for every xL.

If L is finite dimensional, then this is true if merely f(bi)=f(bi) for all the vectors in some basis b1,,bn for L.

Proof. We are given that f is nonnegative homogeneous convex.

Assume that f is linear over L. Then by definition of linearity, f(x)=f(x) for every xL.

Now, for the converse, assume that f(x)=f(x)xL.

  1. Let x,yL.

  2. Then, f(x+y)f(x)+f(y) since f is subadditive.

  3. Also,

    f(x+y)=f((x+y))=f(x+(y)).
  4. But

    f(x+(y))f(x)+f(y)=f(x)f(y)=(f(x)+f(y)).
  5. Thus,

    f(x+y)=f(x+(y))f(x)+f(y).
  6. Combining, we get f(x+y)=f(x)+f(y). Thus, f is additive over L.

  7. For any t<0,

    f(tx)=f(tx)=(t)f(x)=tf(x).
  8. Thus, for any tR, f(tx)=tf(x).

  9. Thus, f is homogeneous over L.

  10. Since f is additive and homogeneous over L, hence f is linear.

Finally, assume that L is finite dimensional and has a basis b1,,bn.

  1. By previous argument f is homogeneous on the basis vectors; i.e., f(tbi)=tf(bi) for any tR and any basis vector.

  2. Let xL.

  3. Then, x=i=1naibi.

  4. Since f is subadditive, hence

    f(a1b1)++f(anbn)f(a1b1++anbn)=f(x)f(x)=f(a1b1anbn)f(a1b1)f(anbn)=f(a1b1)++f(anbn).
  5. This can hold only if all the inequalities are equalities in the previous derivation.

  6. Thus, f(x)=f(x) or f(x)=f(x) for every xL.

  7. Then, following the previous argument, f is linear over L.

9.8.4. Extended Value Extensions#

Tracking domains of convex functions is difficult. It is often convenient to extend a convex function f:VR with a domain domfV to all of V by defining it to be outside its domain.

Definition 9.46 (Extended value extension)

The extended value extension of a convex function f:VR is defined as f~:VR given by:

f~(x){f(x)forxdomfforxdomf.

We mention that the defining inequality (9.1) for a convex function f remains valid (rather gets extended) for its extended value extension too. If f is convex over domf, then for any x1,x2V, and 0<t<1, we have

f~(tx1+(1t)x2)tf~(x1)+(1t)f~(x2).
  1. At t=0 and t=1, this is always an equality.

  2. Since domf is convex, hence if both x1,x2domf, then tx1+(1t)x2domf too.

  3. The inequality then reduces to (9.1).

  4. If either x1 or x2 is not in domf, then the R.H.S. becomes and the inequality stays valid.

Definition 9.47 (Effective domain)

The effective domain of an extended valued function f~:VR is defined as:

domf~={xV|f~(x)<}.

An equivalent way to define the effective domain is:

domf~={xV|tR with (x,t)epif}.

Theorem 9.77 (Effective domain and sum of functions)

If f and g are two extended valued functions, then dom(f+g)=domfdomg.

Proof. We proceed as follows:

  1. At any xdomfdomg, both f(x) and g(x) are finite.

  2. Thus, f(x)+g(x) is finite.

  3. Hence, xdom(f+g).

  4. At any xV(domfdomg), either f(x)= or g(x)=.

  5. Thus, (f+g)(x)=f(x)+g(x)=.

  6. Thus, xdom(f+g).

Several commonly used convex functions have vastly different domains. If we work with these functions directly, then we have to constantly worry about identifying the domain and manipulating the domain as per the requirements of the operations we are considering. This quickly becomes tedious.

An alternative approach is to work with the extended value extensions of all functions. We don’t have to worry about tracking the function domain. The domain can be identified whenever needed by removing the parts from V where the function goes to .

In this book, unless otherwise specified, we shall assume that the functions are being treated as their extended value extensions.

9.8.5. Proper Functions#

Definition 9.48 (Proper function)

An extended real-valued function f:VR is called proper if its domain is nonempty, it never takes the value and is not identically equal to .

xV such that f(x)< and f(x)>xV.

In other words, epif is nonempty and contains no vertical lines.

Putting another way, a proper function is obtained by taking a real valued function f defined on a nonempty set CV and then extending it to all of V by setting f(x)=+ for all xC.

It is easy to see that the codomain for a proper function can be changed from R to (,] to clarify that it never takes the value .

Definition 9.49 (Improper function)

An extended real-valued function f:VR is called improper if it is not proper.

For an improper function f:

  • domf may be empty.

  • f might take a value of at some xV.

Most of our study is focused on proper functions. However, improper functions sometimes do arise naturally in convex analysis.

Example 9.32 (An improper function)

Consider a function f:RR as described below:

f(x)={ if |x|<10 if |x|=1 if |x|>1.

Then, f is an improper function.

9.8.6. Indicator Functions#

Definition 9.50 (Indicator function)

Let CV. Then, its indicator function is given by IC(x)=0xC. Here, domIC=C.

The extended value extension of an indicator function is given by:

IC~(x){0forxCforxC.

Theorem 9.78 (Indicator functions and convexity)

An indicator function is convex if and only if its domain is a convex set.

Proof. Let C be convex. Let IC be its indicator function. Let x,yC. Then, for any t[0,1]

IC(tx+(1t)y)=0tIC(x)+(1t)IC(x)=0.

since tx+(1t)yC as C is convex.

Thus, IC is convex.

If C is not convex, then domIC is not convex. Hence, IC is not convex.

Theorem 9.79 (Restricting the domain of a function)

Let f be a proper function. Then,

dom(f+IC)=domfC.

Also,

(f+IC)(x)=f(x)xdomfC.

The statement is obvious. And quite powerful.

  • The problem of minimizing a function f over a set C is same as minimizing f+IC over V.

9.8.7. Sublevel Sets#

Recall from Definition 2.53 that the α-sublevel set for a real valued function f:VR is given by

Cα={xdomf|f(x)α}.

The strict sublevel sets for a real valued function f:VR can be defined as

Oα={xdomf|f(x)<α}.

The sublevel sets can be shown to be intersection of a set of strict sublevel sets.

Theorem 9.80 (Sublevel set as intersection)

Let V be a real vector space. Let f:VR be a real valued function. Let

Oα={xdomf|f(x)<α}

denote the strict sublevel set of f for α. Let

Cα={xdomf|f(x)α}

denote the sublevel set of f for α. Then,

Cα=μ>αOμ.

Proof. We show that Cαμ>αOμ.

  1. Let xCα.

  2. Then, f(x)α.

  3. Thus, f(x)<μ for every μ>α.

  4. Thus, xOμ for every μ>α.

  5. Thus, Cαμ>αOμ.

We now show that Cαμ>αOμ.

  1. Let xμ>αOμ.

  2. Then, f(x)<μ for every μ>α.

  3. Taking the infimum on the R.H.S. over the set {μR|μ>α}, we obtain

    f(x)inf{μR|μ>α}=α.
  4. Thus, xCα.

  5. Thus, μ>αOμCα.

Theorem 9.81 (Convexity of sublevel sets)

If f:VR is convex, then its sublevel sets given by

Cα={xdomf|f(x)α}

are convex.

Proof. Assume f is convex.

  1. Let x,yCα.

  2. Then, f(x)α and f(y)α.

  3. Let t[0,1].

  4. Let z=tx+(1t)y.

  5. Since f is convex, hence:

    f(z)tf(x)+(1t)f(y)tα+(1t)α=α.
  6. Thus, f(z)α.

  7. Thus, zCα.

  8. Thus, Cα is convex.

The converse is not true. A function need not be convex even if all its sublevel sets are convex.

Example 9.33

Consider the function f(x)=lnx. It is concave (Example 9.45).

Its sublevel sets are convex as they are intervals.

Theorem 9.82 (Convexity of strict sublevel sets)

If f:VR is convex, then its strict sublevel sets given by

Oα={xdomf|f(x)<α}

are convex.

Proof. Assume f is convex.

  1. Let x,yOα.

  2. Then, f(x)<α and f(y)<α.

  3. Let t(0,1).

  4. Let z=tx+(1t)y.

  5. Since f is convex, hence:

    f(z)tf(x)+(1t)f(y)<tα+(1t)α=α.
  6. Thus, f(z)<α.

  7. Thus, zOα.

  8. Thus, Oα is convex.

An alternate proof for showing the convexity of the sublevel sets is to show it as an intersection of strict sublevel sets.

Theorem 9.83 (Intersection of sublevel sets of convex functions)

Let V be a real vector space. Let I be an arbitrary index set. Let fi:V(,] be convex functions for every iI. Let αiR for every iI. Then,

C={x|fi(x)αi,iI}

is a convex set.

Proof. For each iI, consider the set

Ci={x|fi(x)αi}.

Then, Ci is a sublevel set of fi which is convex. Hence, Ci is convex. Now, we can see that

C=iICi

Thus, C is an intersection of convex sets. Hence, C is convex.

Example 9.34 (Convexity of sublevel sets of the quadratic)

Let QSn be a positive semidefinite matrix. Let aRn and cR.

Consider the sets of the form

{xRn|12x,Qx+x,a+c0}.

This is a sublevel set of the quadratic function f(x)=12x,Qx+x,a+c.

Since f is convex, hence the set is convex.

Sets of this form include the solid ellipsoids, paraboloids as well as spherical balls. Here is an example of the spherical ball of the norm induced by the inner product.

{x|x1}={x|x,x10}.

9.8.8. Hypograph#

The hypograph of a function f:VR is given by:

hypof={(x,t)VR|xdomf,f(x)t}.

Just like function convexity is connected to epigraph convexity, similarly function concavity is connected to hypograph convexity.

Theorem 9.84 (Function concavity = Hypograph convexity)

A function f is concave if and only if its hypograph hypof is a convex set.

Proof. f is concave if and only if f is convex if and only if the epigraph of f is convex if and only if the hypograph of f is convex.

9.8.9. Super-level Sets#

Recall from Definition 2.56 that the α-sublevel set for a real valued function f:VR is given by

Dα={xdomf|f(x)α}.

Theorem 9.85

If f:VR is concave, then its super-level sets are convex.

Proof. Assume f is concave.

  1. Let x,yDα.

  2. Then, f(x)α and f(y)α.

  3. Let t[0,1].

  4. Let z=tx+(1t)y.

  5. Since f is concave, hence:

    f(z)tf(x)+(1t)f(y)tα+(1t)α=α.
  6. Thus, f(z)α.

  7. Thus, zDα.

  8. Thus, Dα is convex.

The converse is not true. A function need not be concave even if all its super-level sets are convex.

Example 9.35

Let geometric mean be given by:

g(x)=(i=1nxi)1n

with domg=R+n (with the interpretation that 01n=0).

Let arithmetic mean be given by:

a(x)=1ni=1nxi.

Consider the set:

A={xR+n|g(x)αa(x)}.

A is the set of nonnegative vectors where the geometric mean of the components is at least α times larger than the arithmetic mean.

  1. In Example 9.30 we establish that g is concave.

  2. In Example 9.26 we established that a is convex as well as concave.

  3. Thus, αa is concave.

  4. Thus, gαa is concave (with domgαa=R+n).

  5. Note that A can be redefined as:

    A={xR+n|g(x)αa(x)0}.
  6. Thus, A is a 0-super-level set of gαa.

  7. Since gαa is concave, hence A is convex.

9.8.10. Closed Convex Functions#

Recall from Real Valued Functions that a function is closed if all its sublevel sets are closed. A function is closed if and only if its epigraph is closed. A function is closed if and only if it is lower semicontinuous.

In general, if a function is continuous, then it is lower semicontinuous and hence it is closed.

In this subsection, we provide examples of convex functions which are closed.

9.8.10.1. Affine Functions#

Theorem 9.86 (Affine functions are closed)

Let f:VR be given by

f(x)=x,a+b

where aV and bR. Then f is closed.

Proof. We prove closedness by showing that the epigraph of f is closed.

  1. Let {xk,tk} be a converging sequence of epif.

  2. Let (x,t)=limk(xk,tk).

  3. We have f(xk)tk for every k.

  4. In other words

    xk,a+btkkN.
  5. Taking the limit on both sides, we get

    x,a+bt.
  6. Hence f(x)t.

  7. Hence (x,t)epif.

  8. Hence epif is closed.

9.8.10.2. Norms#

Theorem 9.87 (All Norms are closed)

Let :VR be a norm on a real vector space V. Then is closed.

Proof. The sublevel sets are given by St={xV|xt}. They are nothing but the closed balls of radius t around 0 and by definition closed. Hence all sublevel sets are closed. Thus is closed.

9.8.11. Support Functions#

Definition 9.51 (Support function for a set)

Let V be a real inner product space. Let C be a subset of V. The support function σC:V(,] is defined as

σC(x)=sup{x,y|yC}.

Since V and V are isomorphic, support function σC:V(,] for a set CV is similarly defined as

σC(x)=sup{y,x|yC}.

Example 9.36 (Finite sets)

Let C={b1,,bm} be a finite subset of V. Then,

σC(x)=max{x,b1,,x,bm}.

This follows directly from the definition.

9.8.11.1. Convexity#

Theorem 9.88 (Convexity of support function)

Let V be a real inner product space. Let C be a nonempty subset of V. Then, the support function σC:V(,] is convex.

Proof. Fix a yC and consider the function σy:VR given by

σy(x)=x,y.

σy is linear and accordingly convex. Then,

σC(x)=supyCσy(x)

is a pointwise supremum of convex functions. By Theorem 9.114, σC is convex.

We note that the convexity of the support function σC has nothing to do with the convexity of the underlying set C.

9.8.11.2. Closedness#

Theorem 9.89 (Closedness of support function)

Let V be a real inner product space. Let C be a nonempty subset of V. Then, the support function σC:V(,] is closed.

Proof. Recall that a function is closed if all its sublevel sets are closed.

  1. Let aR.

  2. Consider the sublevel set Sa={xV|σC(x)a}.

  3. Then,

    Sa={xV|supyCx,ya}
  4. Thus,

    Sa={xV|x,yayC}
  5. Define Ay as

    Ay={xV|x,ya}.
  6. Then,

    Sa=yCAy.
  7. Now, Ay is a closed set since x,y is a continuous function.

  8. Thus, Sa is an intersection of closed sets.

  9. Thus, Sa is closed.

  10. Thus, all sublevel sets of σC are closed.

  11. Thus, σC is closed.

9.8.11.3. Equality of Underlying Sets#

Theorem 9.90 (Equality of underlying sets for support functions)

Let A,BV be nonempty, closed and convex sets. Then, A=B if and only if σA=σB.

Proof. If A=B then obviously σA=σB. Now, assume that σA=σB.

  1. For contradiction, assume that AB.

  2. Without loss of generality, assume that there exists yA such that yB.

  3. Since yB and B is a closed convex set, hence, by Theorem 9.169, there exists a hyperplane H strongly separating y from B.

  4. Thus, there exists pV and αR such that

    x,pα<y,pxB.
  5. Taking supremum over B on the L.H.S., we obtain

    σB(p)α<y,pσA(p).
  6. Thus, there exists pV such that

    σB(p)<σA(p).
  7. This contradicts our hypothesis that σA and σB are identical.

  8. Thus, A=B must hold.

9.8.11.4. Closure and Convex Hull#

The next result shows that support function for a set and its closure or its convex hull are identical. This is why, we required A,B to be closed and convex in the previous result.

Theorem 9.91 (Support functions and closure or convex hull of underlying set)

Let AV. Then,

  1. σA=σclA.

  2. σA=σconvA.

Proof. We first consider the case of closure.

  1. AclA.

  2. Thus, σA(y)σclA(y)yV.

  3. Let us now show the reverse inequality.

  4. Let yV.

  5. Then, there exists a sequence {xk} of clA such that

    limkxk,y=σclA(y).
  6. Now for every xkclA, there exists a point zkA such that d(xk,zk)1k.

  7. Thus, limk(zkxk)=0.

  8. Since zkA, hence

    σA(y)zk,y=xk,y+zkxk,y.
  9. Taking the limit k on the R.H.S., we obtain

    σA(y)σclA(y)+0=σclA(y).
  10. Thus, σA(y)=σclA(y) must hold true.

  11. Since this is true for every yV, hence σA=σclA.

Now, consider the case of convex hull.

  1. By definition, AconvA.

  2. Thus, σA(y)σconvA(y)yV.

  3. Let yV.

  4. Then, there exists a sequence {xk} of convA such that

    limkxk,y=σconvA(y).
  5. Since xkconvA, hence, there exists z1k,,znkkA and tkΔnk such that

    xk=i=1nktikzik.
  6. By linearity of the inner product

    xk,y=i=1nktikzik,y=i=1nktikzik,yi=1nktikσA(y)=σA(y).
  7. Taking the limit k on the L.H.S., we obtain

    σconvA(y)=limkxk,yσA(y).
  8. Thus, σA(y)=σconvA(y) must hold true.

  9. Since this is true for every yV, hence σA=σconvA.

9.8.11.5. Arithmetic Properties#

Following properties of support functions are useful in several applications.

Theorem 9.92 (Arithmetic properties of support functions)

  1. (Nonnegative homogeneity) For any nonempty set CV and a vector xV and t0,

    σC(tx)=tσC(x).
  2. (Subadditivity) For any nonempty set CV and a vector u,vV,

    σC(u+v)σC(u)+σC(v).
  3. (Nonnegative scaling of the underlying set) For any nonempty set CV and a vector xV and t0

    σtC(y)=tσC(y).
  4. (Additivity over Minkowski sum of sets) For any two nonempty subsets A,BV and xV

    σA+B(x)=σA(x)+σB(x).

Proof. (1) Nonnegative homogeneity

σC(tx)=supyCtx,y=supyCtx,y=tsupyCx,y=tσC(x).

Here, we used the fact that sup commutes with nonnegative scalars.

(2) Subadditivity

σC(u+v)=supyCu+v,y=supyC(u,y+u,y)supyCu,y+supyCv,y=σC(u)+σC(v).

(3) Nonnegative scaling of the underlying set

σtC(x)=supytCx,y=supyCx,ty=supyCtx,y=tsupyCx,y=tσC(x).

(4) Minkowski sum

σA+B(x)=supyA+Bx,y=supuA,vBx,u+v=supuA,vB(x,u+x,v)=supuAx,u+supvBx,v=σA(x)+σB(x).

9.8.11.6. Cones#

Recall from Definition 9.22 that a set C is called a cone if for every xC and every t0, txC. Also, recall from Definition 9.34 that the polar cone of a set C is given by

C={yV|x,y0xC}.

Theorem 9.93 (Support function of a cone)

Let C be a given cone. Then,

σC(y)=IC(y)yV.

In words, the support function of a cone C is the indicator function of the polar cone of C.

Proof. We proceed as follows

  1. Assume that yC.

  2. Then, x,y0xC.

  3. In particular, 0C since C is a cone.

  4. Accordingly, 0,y=0.

  5. Thus,

    σC(y)=supxCx,y=0.
  6. Now consider yC.

  7. Then, there exists uC such that u,y>0.

  8. Since C is a cone, hence tuC for all t0.

  9. Accordingly,

    σC(y)=supxCx,ytu,yt0.
  10. Taking the limit t, we see that

    σC(y)=.
  11. Thus, σC(y)=0 for all yC and σC(y)= otherwise.

  12. Thus, σC=IC.

Example 9.37 (Support function of nonnegative orthant)

Let V=Rn and C=R+n. C is the nonnegative orthant which is a closed convex cone. Its polar cone is given by

C=Rn

which is the nonpositive orthant {xRn|x0}. By Theorem 9.93,

σR+n(y)=IRn(y).

9.8.11.7. Affine Sets#

Theorem 9.94 (Support function for an affine set)

Let BRm×n and bRm. Define the set CRn as

C={xRn|Bx=b}.

Assume that C is nonempty and let x0C be one of the solutions of the system of equations Bx=b. Then

σC(y)=x0,y+Irange(BT)(y).

Proof. We proceed as follows.

  1. By definition of support function

    σC(y)=sup{x,y|Bx=b}.
  2. Introduce a variable z=xx0.

  3. Then x=z+x0.

  4. Accordingly

    σC(y)=sup{z+x0,y|B(z+x0)=b}=x0,y+sup{z,y|Bz=0}=x0,y+σD(y)

    where D={x|Bx=0}.

  5. We note that the statement Bx=0 is equivalent to Bx0 and Bx0.

  6. In other words, D={xRn|Ax0} where A=[BB].

  7. The set D is a convex polyhedral cone.

  8. By Theorem 9.93, the support function of a cone is the indicator function of its polar cone.

  9. By Theorem 9.66, the polar cone is given by

    D={BTt1BTt2|t1,t20}.
  10. It is easy to see that D=range(BT).

    1. Every vector tRm can be split into two vectors t1,t2R+m such that t=t1t2.

    2. Accordingly BTt=BTt1BTt2.

  11. This gives us

    σC(y)=x0,y+ID(y).

9.8.11.8. Norm Balls#

Theorem 9.95 (Support functions for unit balls)

Let V be a real vector space endowed with a norm :VR. Consider the (closed) unit ball given by

C=B[0,1]={xV|x1}.

Then, the support function is given by

σC(y)=y

where :VR represents the dual norm.

Proof. This flows directly from the definitions of support function and dual norm.

σC(y)=sup{x,y|xC}=sup{x,y|x1}=y.

9.8.12. Gauge Functions#

Definition 9.52 (Gauge function for a set)

Let V be a real vector space. Let C be a nonempty subset of V. The gauge function γC:V(,] is defined as

γC(x)=inf{r0|xrC}.

If xrC for any r0, then γC(x)=. This is consistent with the convention that inf=.

The gauge function is also known as Minkowski functional.

Property 9.33 (Nonnegativity)

The Gauge function is always nonnegative.

Property 9.34 (Value at origin)

γC(0)=0.

Property 9.35 (Subadditive)

If C is convex, then the gauge function is subadditive.

γC(x+y)γC(x)+γC(y).

Proof. We proceed as follows.

  1. If γC(x)= or γC(y)=, then the inequality is satisfied trivially. So assume that both are finite.

  2. Then, the sets X={r0|xrC} and Y={r0|yrC} are not empty.

  3. Thus, we can choose some sγC(x) from X and tγC(y) from Y.

  4. If s=0 or t=0, then 0C. Consequently,

    γC(x+y)=γC(x)=γC(y)=0γC(x)+γC(y)s+t.

    and the inequality is satisfied.

  5. Now, consider the case where s>0 and t>0.

  6. Then, xsC and ytC.

  7. Now,

    x+ys+t=ss+txs+ts+tyt.
  8. Thus, x+ys+t is a convex combination of xs and yt.

  9. Since C is convex, hence x+ys+tC.

  10. Thus, s+t{r0|(x+y)rC}.

  11. Thus, γC(x+y)s+t.

  12. Thus, for every sγC(x) and every tγC(y), γC(x+y)s+t.

  13. Taking infimum on the R.H.S. over sX and tY, we obtain, γC(x+y)γC(x)+γC(y).

Property 9.36 (Homogeneous)

The Gauge function is homogeneous.

γC(sx)=|s|γC(x)sR.

Property 9.37 (Seminorm)

The Gauge function is a seminorm.

Example 9.38 (Norm as a gauge function)

Let V be a normed linear space with the norm :VR.

Let B={xV|x1} be the unit closed ball.

Then,

γB(x)=inf{r0|xrB}=x.

The gauge function for the closed unit ball is simply the norm itself.

9.8.13. Jensen’s Inequality#

Jensen’s inequality stated below is another formulation for convex functions.

Theorem 9.96 (Jensen’s inequality)

A proper function f:V(,] is convex if and only if

(9.2)#f(t1x1++tkxk)t1f(x1)++tkf(xk)

holds true for every x1,,xkV whenever t1,,tk0 and t1++tk=1. This inequality is known as the Jensen’s inequality.

Proof. The Jensen’s inequality reduces to (9.1) for k=2. Thus, the statement is true by definition for k=2. For k>2, we shall present an inductive proof.

Assume f is convex. Then domf is convex and for all x1,x2domf, and t[0,1], we have:

f(tx1+(1t)x2)tf(x1)+(1t)f(x2).
  1. Let x1,,xkV.

  2. If any of xidomf for some i1,,k, then f(xi)= and the Jensen’s inequality holds trivially.

  3. Thus, we shall assume that x1,,xkdomf.

  4. Since domf is convex, hence their convex combination t1x1++tkxkdomf.

  5. Inductively, assume that the Jensen’s inequality holds for k1; i.e.,

    f(r1x1++rk1xk1)r1f(x1)++rk1f(xk1)

    holds true whenever r1,,rk10 and r1++rk1=1.

  6. WLOG, assume that tk<1. Thus, 1tk>0.

  7. Define y=i=1k1tixi where ti=ti1tk.

  8. Note that ti0. Also, i=1k1ti=1 since i=1k1ti=1tk.

  9. We can now write:

    f(t1x1++tkxk)=f((1tk)y+tkxk)(1tk)f(y)+tkf(xk)=(1tk)f(t1x1+tk1xk1)+tkf(xk)(1tk)(t1f(x1)++tk1f(xk1))+tkf(xk)=t1f(x1)++tk1f(xk1)+tkf(xk).
  10. Thus, f satisfies Jensen’s inequality.

For the converse, assume that f satisfies Jensen’s inequality. Let x1,x2domf and t[0,1]. Then, by Jensen’s inequality for k=2,

f(tx1+(1t)x2)tf(x1)+(1t)f(x2)<.

Thus, tx1+(1t)x2domf. Thus, domf is convex. Also, f satisfies (9.1). Hence, f is convex.

Jensen’s inequality is essential in proving a number of famous inequalities.

Example 9.39 (Logarithm and Jensen’s inequality)

In Example 9.45, we show that ln(x) is concave. Consequently, ln(x) is convex.

Now, let x1,,xnR++ be positive real numbers and let t1,,tn0 such that t1++tn=1. Then, by Jensen’s inequality

ln(t1x1++tnxn)t1lnx1tnlnxn.

Multiplying by 1 and taking exponential on both sides, we obtain

t1x1++tnxnx1t1xntn.

For a particular choice of t1==tn=1n, we obtain

1n(x1++xn)x1xnn

which is the AM-GM inequality suggesting that arithmetic mean is greater than or equal to the geometric mean for a group of positive real numbers.

Theorem 9.97 (Jensen’s inequality for nonnegative homogeneous convex functions)

If f:V(,] is a nonnegative homogeneous proper convex function, then

(9.3)#f(t1x1++tkxk)t1f(x1)++tkf(xk)

holds true for every x1,,xkV whenever t1,,tk0.

Proof. Let x1,,xkV. If any of xidomf for some i1,,k, then f(xi)= and the inequality holds trivially. Thus, we shall assume that x1,,xkdomf.

By Theorem 9.75, f is subadditive. Thus,

f(t1x1++tkxk)f(t1x1)++f(tkxk).

The nonnegative homogeneity gives us

f(t1x1)++f(tkxk)=t1f(x1)++tkf(xk).

We are done.

9.8.14. Quasi-Convex Functions#

Definition 9.53 (Quasi convex function)

Let V be a real vector space. Let f:VR be a real valued function. Let the sublevel sets of f be given by

Cα={xdomf|f(x)α}.

If the sublevel sets Cα of f are convex for every αR, then f is called a quasi-convex function.