9.15. Directional Derivatives#

This section deals with the subject of directional derivatives for convex functions.

9.15.1. Convex Real Functions#

For real functions f:RR, several useful results are available.

Remark 9.13 (Domain of a convex real function)

The domain of a convex real function is an interval.

Let f:RR be convex. Then domf is convex. But every convex subset of real line is an interval.

  1. Now let I=domf be an interval.

  2. We say a=infI as the left end point of I.

  3. We say b=supI as the right end point of I.

  4. a and b may or may not belong to I.

  5. If both a and b belong to I, then I is a closed interval.

  6. If neither a nor b belongs to I, then I is an open interval.

9.15.1.1. Characterization#

Theorem 9.199 (Characterization of real convex functions)

Let f:RR be a real function with domf=I which is an interval (closed or open or semi-open). Let a and b be the left and right endpoints of the interval I.

The following are equivalent.

  1. f is convex over I.

  2. For every x1,x2,x3I with x1<x2<x3,

    f(x2)f(x1)x2x1f(x3)f(x1)x3x1.
  3. For every x1,x2,x3I with x1<x2<x3,

    f(x2)f(x1)x2x1f(x3)f(x2)x3x2.
  4. For every x1,x2,x3I with x1<x2<x3,

    f(x3)f(x1)x3x1f(x3)f(x2)x3x2.

Proof. (1) (2) Assume that f is convex.

  1. Let

    α=x3x2x3x1,β=x2x1x3x1.
  2. Then, α+β=1 and α,β(0,1).

  3. Also, verify that

    x2=αx1+βx3.
  4. Thus,

    f(x2)αf(x1)+βf(x3)f(x2)x3x2x3x1f(x1)+x2x1x3x1f(x3)(x3x1)f(x2)(x3x2)f(x1)+(x2x1)f(x3)(x3x1)f(x2)((x3x1)(x2x1))f(x1)+(x2x1)f(x3)(x3x1)(f(x2)f(x1))(x2x1)(f(x3)f(x1))f(x2)f(x1)x2x1f(x3)f(x1)x3x1.

(2) (1)

  1. Let x1,x3I and t(0,1).

  2. WLOG, assume that x1<x3.

  3. Let α=t and β=(1t).

  4. Let x2=αx1+βx3.

  5. Then, x1<x2<x3.

  6. From the hypothesis, we have

    f(x2)f(x1)x2x1f(x3)f(x1)x3x1.
  7. Using the previous argument backwards, this implies

    f(x2)αf(x1)+βf(x3)=tf(x1)+(1t)f(x3).
  8. Thus, f is convex.

(2) (3)

  1. Pick any x1,x2,x3I with x1<x2<x3.

  2. By hypothesis (2)

    f(x2)f(x1)x2x1f(x3)f(x1)x3x1(x3x1)(f(x2)f(x1))(x2x1)(f(x3)f(x1))(x3x1)f(x2)((x3x1)(x2x1))f(x1)+(x2x1)f(x3)((x3x2)+(x2x1))f(x2)((x3x2)f(x1)+(x2x1)f(x3)(x3x2)(f(x2)f(x1))(x2x1)(f(x3)f(x2))f(x2)f(x1)x2x1f(x3)f(x2)x3x2.

(2) (4)

  1. Pick any x1,x2,x3I with x1<x2<x3.

  2. By hypothesis (2)

    f(x2)f(x1)x2x1f(x3)f(x1)x3x1(x3x1)(f(x2)f(x1))(x2x1)(f(x3)f(x1))(x3x1)f(x2)(x3x2)f(x1)+((x2x3)+(x3x1))f(x3)(x3x1)(f(x2)f(x3))(x3x2)(f(x1)f(x3))(x3x1)(f(x3)f(x2))(x3x2)(f(x3)f(x1))(x3x2)(f(x3)f(x1))(x3x1)(f(x3)f(x2))f(x3)f(x1)x3x1f(x3)f(x2)x3x2.

9.15.1.2. One Sided Derivatives#

Recall from Definition 2.83 that if f is defined over [a,b), then the right hand derivative is defined as:

f+(a)=limxa+f(x)f(a)xa=limh0f(a+h)f(a)h

if the limit exists. Similarly, if f is defined over (c,a], then the left hand derivative is defined as:

f(a)=limxaf(x)f(a)xa=limh0f(a+h)f(a)h=limr0f(a)f(ar)r

if the limit exists.

We introduce two helper functions

s+(x,h)=f(x+h)f(x)h

and

s(x,h)=f(x)f(xh)h

where h>0.

Then

f+(x)=limh0s+(x,h)

and

f(x)=limh0s(x,h).

An interesting property of convex functions is that the one sided derivatives always exist. On the real line, there are only two directions to move; left and right. The one sided derivatives play the role of directional derivatives on the real line.

Lemma 9.4 (Monotonicity of s+ and s for convex functions)

Let f:RR with domf=I be convex. Let a and b be the left and right endpoints of the interval I. Then s+ and s as functions of h are monotonic.

  1. s+(x,h) is a nondecreasing function of h.

  2. s(x,h) is a nonincreasing function of h.

Proof. Monotonicity of s+.

  1. Let xI such that xb.

  2. Let h1,h2>0 such that h1<h2 and x+h2I.

  3. Consider the three points x<x+h1<x+h2.

  4. Then

    f(x+h1)f(x)h1f(x+h2)f(x)h2.
  5. Hence s+(x,h1)s+(x,h2).

  6. Hence s+ is a nondecreasing function of h.

The argument for monotonicity of s is similar.

Observation 9.6 (One sided derivatives as infimum/supremum)

Due to the monotonicity of s+ over h, we have

f+(x)=limh0s+(x,h)=infh>0s+(x,h).

Similarly, due to the monotonicity of s+ over h, we have

f(x)=limh0s(x,h)=suph>0s(x,h).

Theorem 9.200 (Real convex functions and one sided derivatives)

Let f:RR with domf=I be convex. Let a and b be the left and right endpoints of the interval I. Then, for every xintI=(a,b), the left hand derivative f(x) and the right hand derivative f+(x) exist.

If aI, then the right hand derivative f+(a) exists. If bI, then the left hand derivative f(b) exists.

If aI, we define f(a)=. If bI, we define f+(b)=.

Proof. We are given that f is convex over (a,b).

  1. Let xintI.

  2. Then there exists r>0 such that (xr,x+r)I.

  3. For h>0, define

    F(h)=f(x+h)f(x)h.
  4. Let 0<h1<h2 such that h2<r.

  5. Let x1=x, x2=x+h1, x3=x+h2.

  6. Since f is convex, hence by Theorem 9.199

    f(x2)f(x1)x2x1f(x3)f(x1)x3x1.
  7. But that means

f(x+h1)f(x)h1f(x+h2)f(x)h2.
  1. Thus, whenever h1<h2 (up to h2<r), F(h1)F(h2).

  2. Thus, F is a nondecreasing (monotone) function of h in some interval (0,δ) where δ<r.

  3. Then, f+(x)=limh0F(h) exists.

A similar argument shows that f(x) also exists. Similar arguments apply for the one sided derivatives at the end points.

9.15.1.3. Continuity#

Theorem 9.201 (Convex real function is continuous)

Let f:RR be a real convex function with domf=I. Let a and b be the left and right endpoints of the interval I. Then,

  1. f is continuous at every x(a,b).

  2. If aI, then f is continuous from the right at a.

  3. If bI, then f is continuous from the left at b.

In other words, f is continuous on I.

Proof. We proceed as follows.

  1. Let x(a,b).

  2. By Theorem 9.200, the one sided derivatives f+(x) and f(x) exist.

  3. Then, by limit arithmetic

    limh0(f(x+h)f(x))=(limh0f(x+h)f(x)h)(limh0h)=0.
  4. Similarly,

    limh0(f(x+h)f(x))=(limh0f(x+h)f(x)h)(limh0h)=0.
  5. Thus,

    limh0(f(x+h)f(x))=limh0(f(x+h)f(x))=0.
  6. Thus, f is continuous at x.

  7. Since x was arbitrary, hence f is continuous on (a,b).

Now consider the case where aI.

  1. By Theorem 9.200, the one sided derivative f+(a) exists.

  2. Then, by limit arithmetic

    limh0(f(a+h)f(a))=(limh0f(a+h)f(a)h)(limh0h)=0.
  3. Hence f is continuous from the right at a.

A similar argument holds for continuity from the left at b.

9.15.1.4. Properties of One Sides Derivatives#

Theorem 9.202 (Properties of one-sided derivatives)

Let f:RR be a real convex function with domf=I. Let a and b be the left and right endpoints of the interval I.

  1. We have f(x)f+(x) for every xI.

  2. If xintI then both f(x) and f+(x) are finite.

  3. If x,zI and x<z, then f+(x)f(z).

  4. The functions f,f+:RR are nondecreasing over I.

  5. The function f+ is right-continuous at every interior point of I. If aI then f+ is right-continuous at a.

  6. The function f is left-continuous at every interior point of I. If bI then f is left-continuous at b.

  7. The function f+ is upper-semicontinuous at every xI.

  8. The function f is lower-semicontinuous at every xI.

Proof. (1)

  1. If aI, then by convention f(a)=. Hence f(a)f+(a).

  2. If bI, then by convention f+(b)=. Hence f(b)f+(b).

  3. Now let xintI.

  4. Then there is r>0 such that (xr,x+r)I.

  5. Pick any h>0 such that h<r.

  6. Then, using the three points xh,x,x+h, we have

    f(x)f(xh)hf(x+h)f(x)h

    due to Theorem 9.199.

  7. Taking the limit h0, we see that

    f(x)f+(x)

    holds true for every xintI.

(2)

  1. Let xintI.

  2. Let h>0 such that (xh,x+h)I.

  3. Then we have

    f+(x)f(x+h)f(x)h<.
  4. Similarly, we have

    <f(x)f(xh)hf(x).
  5. By (1), we have

    <f(x)f+(x)<.
  6. Hence both are finite at interior points of I.

(3)

  1. Let y=x+z2.

  2. Due to Theorem 9.199, we have

    f(y)f(x)yxf(z)f(y)zy.
  3. We also have

    f+(x)f(y)f(x)yx and f(z)f(y)zyf(z).
  4. Combining, we get f+(x)f(z).

(4)

  1. Let x,zI such that x<z.

  2. From (3), we have f+(x)f(z).

  3. From (1), we have f(z)f+(z).

  4. Combining, we have f+(x)f+(z).

  5. Hence f+ is nondecreasing.

  6. Similarly, f(x)f+(x)f(z).

  7. Hence f is nondecreasing.

(5)

  1. Pick any xI such that xb (if bI).

  2. Then x<b.

  3. We can pick h>0 and r>0 such that x+h+r<b.

  4. Then f+(x+h)f(x+h+r)f(x+h)r.

  5. We established in Theorem 9.201 that f is continuous.

  6. Taking the limit h0, we obtain

    limh0f+(x+h)f(x+r)f(x)r.

    This is valid since f is continuous.

  7. Now taking the limit r0 on the R.H.S., we obtain

    limh0f+(x+h)f+(x).
  8. Since f+ is nondecreasing by claim (4), hence

    f+(x)limh0f+(x+h).
  9. Together, we must have

    f+(x)=limh0f+(x+h).
  10. Hence f+ is right continuous at x.

(6)

  1. An argument similar to (5) shows that f is left continuous at every xI except for x=a (if aI).

(7) Upper semicontinuity of f+

  1. We need to show that for every ϵ>0 there exists r>0 such that

    f+(y)<f+(x)+ϵ for every y(xr,x+r)I.
  2. Pick some ϵ>0.

  3. Consider any xintI.

  4. By (6) f+ is right continuous at x.

  5. Hence there exists r1>0 such that for every y[x,x+r1), we have

    |f+(y)f+(x)|<ϵ.
  6. By (4), f+ is nondecreasing. Hence for every y[x,x+r1)

    |f+(y)f+(x)|=f+(y)f+(x).
  7. Hence for every y[x,x+r1), we have

    f+(y)f+(x)<ϵf+(y)<f+(x)+ϵ.
  8. Now, let r=min(xa,r1).

  9. By monotonicity of f+, for every y(xr,x)

    f+(y)f+(x).
  10. Hence for every y(x,r,x)

    f+(y)<f+(x)+ϵ.
  11. Combining the two, for every y(xr,x+r), we have

    f+(y)<f+(x)+ϵ.
  12. Hence f+ is u.s.c. at x.

  13. Now, if aI then let x=a.

  14. By right continuity of f+ at a, there exists r>0 such that for every y[a,a+r), we have

    f+(y)<f+(a)+ϵ.
  15. Also (ar,a+r)I=[a,a+r).

  16. Hence f+ is u.s.c. at a.

  17. If bI, then by convention f+(b)=.

  18. Hence f+ is u.s.c. at b.

(8) Lower semicontinuity of f

  1. The argument is similar to (7).

9.15.2. Directional Derivatives of Proper Functions#

Definition 9.70 (Directional derivative)

Let f:V(,] be a proper function with S=domf. Let xintS. The directional derivative at x in the direction dV is defined by

f(x;d)limα0f(x+αd)f(x)α

provided the limit exists.

We say that f is directionally differentiable at x if it is directionally differentiable in every direction at x.

The directional derivative is a scalar quantity (R) if it is defined (i.e., the limit exists). When we say that

f(x;d)=limt0f(x+td)f(x)t,

we mean that f is defined over a set {v|v=x+td,0<t<tmax} and for every ϵ>0, there exists δ>0 such that

|f(x+td)f(x)tf(x;d)|<ϵ whenever 0<t<δ.

Since xintS, hence there exists r>0 such that B(x,r)S.

With vB(x,r), we need td<r. Thus, a tmax=rd is a suitable range of allowed values for t. Accordingly, 0<δ<tmax can be chosen.

Remark 9.14 (Directional derivative for zero vector)

If d=0 then, f(x;d)=0.

We can see this from the fact that

f(x;0)=limα0f(x+α0)f(x)α=0.

A useful result is for computing the directional derivative of a function which is the pointwise maximum of a finite number of proper functions.

We recall from Theorem 3.22 that the interior of a finite intersection of sets is the intersection of their interiors. This is useful in identifying the interior of the domain for a pointwise maximum of a finite set of functions.

9.15.3. Differentiability#

9.15.3.1. Differentiability of Proper Functions#

Definition 9.71 (Differentiability of proper functions)

Let f:V(,] be a proper function with S=domf. Let xintS. f is said to be differentiable at x if there exists gV such that:

(9.5)#limh0f(x+h)f(x)h,gh=0.

The unique vector g satisfying this condition is called the gradient of f at x and is denoted by f(x).

If f is differentiable at some xintS, then there is a simple formula to connect the gradient and the directional derivatives.

9.15.3.2. Gradient and Directional Derivatives#

Theorem 9.203 (Gradient and directional derivatives)

Let f:V(,] be a proper function with S=domf. Let xintS. Assume that f is differentiable at x. Then, for any dV,

f(x;d)=d,f(x).

In other words, the directional derivative is the projection of the gradient in the specified direction.

Proof. For d=0, the equality is obvious. We shall consider the case where d0.

  1. Since f is differentiable at x, hence

limh0f(x+h)f(x)h,f(x)h=0.
  1. In particular, if we take the limit of h along the direction of d as td where t>0 and t0+, then

    limt0+f(x+td)f(x)td,f(x)td=0.
  2. Splitting the terms, we get

    limt0+f(x+td)f(x)tdlimt0+d,f(x)d=0.
  3. Multiplying with d and simplifying, we get:

    limt0+f(x+td)f(x)tlimt0+d,f(x)=0.
  4. Thus,

    f(x;d)=limt0+f(x+td)f(x)t=d,f(x).

9.15.3.3. Gradient in Rn#

Remark 9.15 (Gradient in Rn)

It is imperative to compare the definition of gradients in this section with Definition 5.1 (differentiability of functions from Rn to Rm) and the notion of the gradient as defined in Definition 5.4.

To better develop our understanding of gradients, let us examine the gradient in the Euclidean space Rn. The standard basis is given by B={e1,,en} which are the coordinate unit vectors. The standard inner product is given by the dot product

x,y=yTx=xTy.

A vector xRn is written as

x=i=1nxiei.

The individual coordinates are obtained via

xi=ei,x=x,ei=xTeii[1,,n].

Let f:Rn(,] be a proper function. Let S=domf. Let xintS. Assume that f is differentiable at x. Let g=f(x). Let

g=ingiei.

Following the notation in Observation 5.1, the derivative of f at x, denoted by Df(x) is given by

limh0f(x+h)f(x)Df(x)h2h2=0.

We don’t have to check for x+hdomf as f is a proper function with a value of at points outside its effective domain.

Compare this with (9.5). For f:RnR, Df(x) is a row vector. If we let g~=Df(x)T, then

Df(x)h=g~Th=h,g~.

Then, the definition of g~ in the limit above is exactly the same as g in (9.5). Thus, g~=g. We can see that our definition of gradient coincides with the definition in Definition 5.4 for Rn with the dot product as standard inner product.

Now consider the components of g.

gi=ei,g=ei,f(x).

By Theorem 9.203, the directional derivative in the direction ei is given by

f(x;ei)=ei,f(x)=ei,g=gi.

Thus,

f(x)xi=ei,f(x).

The partial derivatives of f at x along the standard basis vectors are identical to the directional derivatives of f.

f(x)=[f(x)x1f(x)xn]=[e1,f(x)en,f(x)].

Then, for an arbitrary direction d=i=1nei, the directional derivative becomes

f(x;d)=d,f(x)=f(x)Td=i=1nf(x)xidi.

Recall from Definition 9.70, that the directional derivative is independent on the choice of the inner product. This is also clear from the expression i=1nf(x)xidi as the partial derivatives are independent of the choice of the inner product.

However, this means that the gradient itself must depend on the choice of inner product. If ,a and ,b are two different inner products defined on Rn, then the gradients of f at x w.r.t. the two inner products, denoted by af(x) and bf(x) must satisfy the relationship

f(x;d)=d,af(x)a=d,bf(x)bdV.

In the following, we shall assume that f(x) denotes the gradient w.r.t. the dot product.

Consider the inner product given by

x,yH=xTHy

where HRn×n is a symmetric positive definite matrix.

Then,

(Hf(x))i=Hf(x)Teicoordinate in standard basis=Hf(x)T(HH1)eiH is invertible=Hf(x)TH(H1ei)=H1ei,Hf(x)Hby definition of this inner product=f(x;H1ei)directional derivative w.r.t. this inner product=f(x)TH1eidirectional derivative w.r.t. dot product=(H1f(x))TeiH is symmetric.

Thus,

Hf(x)=H1f(x).

Thus, the gradient w.r.t. the inner product ,H is the scaled version of the standard gradient where the scaling factor is H1.

9.15.3.4. Gradient in Rm×n#

Remark 9.16 (Gradient in Rm×n)

We next look at the vector space of real matrices. The standard basis is a family of unit matrices {Eij}1im,1jn where Eij has the (i,j)-th entry as 1 and other entries as 0.

The standard inner product is given by

X,Y=tr(YTX)X,YRm×n.

Let f:Rm×nR be a proper function. Let S=domf. Let XintS. Assume that f is differentiable at X.

The gradient is given by

f(X)=(f(X)xij)ij.

The directional derivative for some direction DRm×n is given by

f(X;D)=D,f(X)=tr(f(X)TD).

Consider the inner product given by

X,YH=tr(XTHY)

where HRm×m is a symmetric positive definite matrix.

Then,

(Hf(X))ij=tr(Hf(X)TEij)coordinate in standard basis=tr(Hf(X)T(HH1)Eij)H is invertible=tr(Hf(X)TH(H1Eij))=H1Eij,Hf(X)Hby definition of this inner product=f(X;H1Eij)directional derivative w.r.t. this inner product=tr(f(X)TH1Eij)directional derivative w.r.t. standard inner product=(H1f(X))TEijH is symmetric.

Thus,

Hf(X)=H1f(X).

9.15.4. Proper Convex Functions#

9.15.4.1. Existence of Directional Derivatives#

An important property of directional derivatives is that if f is a proper convex function then f is directionally differentiable at every xintdomf.

Theorem 9.204 (Existence of directional derivatives for convex functions.)

Let f:V(,] be a proper convex function with S=domf. Let xintS. Then, for any dV, the directional derivative f(x;d) exists.

Proof. This is a consequence of the directional differentiability of the scalar convex functions.

  1. Define the convex function F:RR as

    F(t)=f(x+td).
  2. Let I=domF.

  3. Then I is an interval of values for which x+tyS.

  4. Since xintS, hence t=0intI.

  5. We now note that

    f(x;d)=limt0f(x+td)f(x)t=limt0F(t)F(0)t=F+(0).

    It is the right hand derivative of F at t=0.

  6. By Theorem 9.200, F+(0) exists.

  7. Hence f(x;d) exists for every xintS and every dV.

Observation 9.7 (Relation between the directional derivatives in opposite directions)

We can see that

f(x;d)=limt0f(xtd)f(x)t=limt0F(t)F(0)t=limr0F(r)F(0)r=F(0).

Hence

F(0)=f(x;d).

By Theorem 9.202

F(0)F+(0).

Hence

f(x;d)f(x;d)dV.

Observation 9.8 (Directional derivative as infimum)

Let f:V(,] be a proper convex function with S=domf. Let xintS. Then, for any dV,

f(x;d)=inft>0f(x+td)f(x)t.

This follows from the fact that F(t)=f(x+td) is convex and due to Observation 9.6,

f(x;d)=F+(0)=inft>0F(t)F(0)t=inft>0f(x+td)f(x)t.

9.15.4.2. Upper Semicontinuity#

The next result generalizes the upper semicontinuity property of the right hand derivatives of real convex functions.

Theorem 9.205

Let f:V(,] be a proper convex function with domf=S. Assume that S is an open subset of V. Let {fk} be a sequence of proper convex functions fk:V(,] with domfk=S with the property that

limkfk(xk)=f(x)

holds true for every xS and every sequence {xk} of S that converges to x. Then for any xS and any direction dV and any sequences {xk} of S and {dk} of V converging to x and d respectively, we have

lim supkfk(xk;dk)f(x;d).

Furthermore if f is differentiable over S, then f is also continuously differentiable over S.

Proof. Limit superior

  1. Choose any ϵ>0.

  2. By definition of the directional derivative, there exists a t>0 such that

    f(x+td)f(x)t<f(x;d)+ϵ.
  3. Due to Observation 9.8, for every k and every t>0, we have

    fk(xk;dk)fk(xk+tdk)f(xk)t.
  4. Now

    limkfk(xk+tdk)f(xk)t=f(x+td)f(x)t.
  5. Hence for sufficiently large k, we have

    fk(xk;dk)fk(xk+tdk)f(xk)t<f(x;d)+ϵ.
  6. By taking the limit superior on the L.H.S. as k, we have

    lim supkfk(xk;dk)f(x;d)+ϵ.
  7. Since this is valid for every ϵ>0, hence we must have

    lim supkfk(xk;dk)f(x;d)

    as desired.

Continuous differentiability

  1. We are given that f is differentiable over S.

  2. Then f is also continuous over S.

  3. Let xS.

  4. Let {xk} be a sequence of S converging to x.

  5. Let dV be any nonzero direction.

  6. Due to Theorem 9.203, for every k,

    f(xk;d)=d,f(xk).
  7. Hence

    lim supkd,f(xk)=lim supkf(xk;d)f(x;d)=d,f(x).
  8. By replacing d with d in the previous argument, we have

    lim infkd,f(xk)=lim supkd,f(xk)=lim supkf(xk;d)f(x;d)=d,f(x).
  9. Hence

    lim infkd,f(xk)d,f(x).
  10. Thus we have

    lim supkd,f(xk)lim infkd,f(xk).
  11. But then this must be an equality. Hence

    limkd,f(xk)=d,f(x).
  12. Since this is valid for every nonzero direction dV, hence we must have

    limkf(xk)=f(x).
  13. Hence f is continuous at every xS.

  14. Hence f is continuously differentiable at every xS.

9.15.4.3. Directional Derivatives Map#

The existence of directional derivatives in all directions allows us to consider a mapping from a direction dV to the directional derivative of f in this direction at x. We can define a directional derivative map parameterized by xS as:

gx(d)f(x;d)=limα0f(x+αd)f(x)α.

We shall refer to such maps by df(x;d).

Theorem 9.206 (Convexity and homogeneity of df(x;d))

Let f:V(,] be a proper convex function with S=domf. Let xintS. Then, the function df(x;d) is convex and nonnegative homogeneous.

Nonnegative homogeneity: For any t0 and dV,

f(x;td)=tf(x;d).

Proof. Convexity

  1. Let d1,d2V and t(0,1).

  2. Let d=td1+(1t)d2.

  3. Then,

    f(x;d)=f(x;td1+(1t)d2)=limα0f(x+α[td1+(1t)d2])f(x)α=limα0f(tx+αtd1+(1t)x+α(1t)d2)f(x)α=limα0f(t(x+αd1)+(1t)(x+αd2))f(x)αlimα0tf(x+αd1)+(1t)f(x+αd2)tf(x)(1t)f(x)α=tlimα0f(x+αd1)f(x)α+(1t)limα0f(x+αd2)f(x)α=tf(x;d1)+(1t)f(x;d2).

    We used the convexity property of f in this derivation.

  4. Thus, f(x;d) is convex.

Nonnegative homogeneity

  1. For t=0,

    f(x,0d)=f(x,0)=0=0f(x;d).

    Thus, the homogeneity property is trivial for t=0.

  2. Now consider t>0.

  3. Then,

    f(x;td)=limα0f(x+αtd)f(x)α=tlimα0f(x+αtd)f(x)αt=tf(x;d).
  4. Thus, f(x;d) is nonnegative homogeneous.

9.15.4.4. As Linear Underestimator#

Directional derivatives are a linear underestimator for convex functions.

Theorem 9.207 (Directional derivative as linear underestimator)

Let f:V(,] be a proper convex function with S=domf. Let xintS. Then, for every yS

f(y)f(x)+f(x;yx).

Proof. Note that

f(x;yx)=limα0f(x+α(yx))f(x)α=limα0f((1α)x+αy)f(x)αlimα0(1α)f(x)+αf(y)f(x)α=limα0α(f(y)f(x))α=limα0(f(y)f(x))=f(y)f(x).

Thus,

f(y)f(x)+f(x;yx).

9.15.5. Pointwise Maximum of Finite Set of Functions#

9.15.5.1. Directional Derivative#

Theorem 9.208 (Directional derivative of a maximum of functions)

Let f1,f2,,fm:V(,] be proper functions. Let f:V(,] be defined as

f(x)=max{f1(x),,fm(x)}

with domf=i=1mdomfi.

Let xintdomf=i=1mintdomfi and dV. Assume that f(x;d) exists for every i1,,m.

Let I(x)={i1,,m|fi(x)=f(x)} be the set of indices of functions whose value at x equals f(x). Then,

f(x;d)=maxiI(x)f(x;d).

In other words, the directional derivative of a pointwise maximum of functions equals the maximum of directional directives of functions which attain the pointwise maximum at a specific point.

Proof. The key idea here is that for computing the directional derivative f(x;d), only those functions are relevant for which fi(x)=f(x). We need to show this first.

  1. Since xintdomf, there exists B(x,r) such that f and fi are all defined over this open ball.

  2. Let s=rd.

  3. For every i1,,m, let gi:RR be defined as

    gi(t)=fi(x+td)

    with domgi=[0,s). sd=r. Thus, x+tdB(x,r). Hence, gi are well defined.

  4. Then,

    limt0+gi(t)=limt0+fi(x+td)=limt0+[(fi(x+td)fi(x))+fi(x)]=limt0+[tfi(x+td)fi(x)t+fi(x)]=0fi(x;d)+fi(x)=fi(x)=gi(0).

    We used the fact that fi(x;d) exists for every fi.

  5. Thus, gi is continuous from the right at t=0 for every i1,,m.

  6. Let iI(x) and jI(x).

  7. Then, fi(x)>fj(x). Alternatively gi(0)>gj(0).

  8. Since gi,gj are continuous from the right, hence there exists ϵij>0 such that gi(t)>gj(t) for every t[0,ϵij].

  9. Minimizing ϵij over all pairs of iI(x) and jI(x), there exists ϵ>0 such that for any iI(x) and jI(x),

    fi(x+td)=gi(t)>gj(t)=fj(x+td)t[0,ϵ].

We can now compute the directional derivative.

  1. For every t[0,ϵ],

    f(x+td)=maxi=1,,mfi(x+td)=maxiI(x)fi(x+td).
  2. Consequently, for any t(0,ϵ]

    f(x+td)f(x)t=maxiI(x)fi(x+td)f(x)t=maxiI(x)(fi(x+td)fi(x))t=maxiI(x)fi(x+td)fi(x)t.

    We used the fact that fi(x)=f(x) for every iI(x).

  3. Taking the limit t0,

    f(x;d)=limt0f(x+td)f(x)t=limt0maxiI(x)fi(x+td)fi(x)t=maxiI(x)limt0fi(x+td)fi(x)t=maxiI(x)fi(x;d).

9.15.5.2. Finite Set of Convex Functions Case#

Theorem 9.209 (Directional derivative of pointwise maximum of convex functions)

Let f1,f2,,fm:V(,] be proper convex functions. Let f:V(,] be defined as

f(x)=max{f1(x),,fm(x)}

with domf=i=1mdomfi.

Let xintdomf and dV. Then,

f(x;d)=maxiI(x)f(x;d).

where I(x)={i1,,m|fi(x)=f(x)}.

Proof. Since fi are proper convex, hence their pointwise maximum f is proper convex.

By Theorem 9.204, the directional derivatives f(x;d) and fi(x;d) for i=1,,m exist.

By Theorem 9.208,

f(x;d)=maxiI(x)f(x;d).

where I(x)={i1,,m|fi(x)=f(x)}.