4.3. Linear Transformations

In this section, we will use the symbols $V$ and $W$ to represent arbitrary vector spaces over a field $\mathbb{F}$. Unless otherwise specified, the two vector spaces won't be related in any way. The following results can be restated for more general situations where $V$ and $W$ are defined over different fields, but we will assume that they are defined over the same field $\mathbb{F}$ for simplicity of discourse.

4.3.1. Operators

Operators are mappings from one vector space to another. Normally, they are total functions.

In this section, we introduce different types of operators between vector spaces. Some operators are relevant only for real vector spaces.

Definition 4.44 (Homogeneous operator)

Let $V$ and $W$ be vector spaces (over some field $\mathbb{F}$). An operator $T : V \to W$ is called homogeneous if for every $x \in V$ and for every $\lambda \in \mathbb{F}$

$$T(\lambda x) = \lambda T(x).$$

Definition 4.45 (Positively homogeneous operator)

Let $V$ and $W$ be real vector spaces (over the field $\mathbb{R}$). An operator $T : V \to W$ is called positively homogeneous if for every $x \in V$ and for every $\lambda \in \mathbb{R}_{++}$

$$T(\lambda x) = \lambda T(x).$$

Definition 4.46 (Additive operator)

Let $V$ and $W$ be vector spaces. An operator $T : V \to W$ is called additive if for every $x, y \in V$

$$T(x + y) = T(x) + T(y).$$
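
As a quick numerical illustration of these definitions (an example we add here; the specific vectors are arbitrary), consider the Euclidean norm viewed as an operator from $\mathbb{R}^3$ to $\mathbb{R}$: it is positively homogeneous, but neither homogeneous nor additive.

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([0.5, 4.0, -1.0])

T = np.linalg.norm  # T(x) = ||x||_2, an operator from R^3 to R

# Positively homogeneous: T(lam * x) == lam * T(x) for lam > 0
lam = 2.5
print(np.isclose(T(lam * x), lam * T(x)))    # True

# Not homogeneous: the property fails for negative scalars
print(np.isclose(T(-1.0 * x), -1.0 * T(x)))  # False

# Not additive: the triangle inequality is strict in general
print(np.isclose(T(x + y), T(x) + T(y)))     # False
```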

4.3.2. Linear Transformations

A linear operator is additive and homogeneous.

Definition 4.47 (Linear transformation)

We call a map $T : V \to W$ a linear transformation from $V$ to $W$ if for all $x, y \in V$ and $\alpha \in \mathbb{F}$, we have

  1. $T(x + y) = T(x) + T(y)$, and

  2. $T(\alpha x) = \alpha T(x)$.

A linear transformation is also known as a linear map or a linear operator.
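
Every real $m \times n$ matrix $A$ induces a linear transformation $x \mapsto Ax$ from $\mathbb{R}^n$ to $\mathbb{R}^m$. A minimal sketch checking both defining properties numerically (the matrix and vectors are arbitrary random choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # T(x) = A @ x maps R^4 -> R^3

x = rng.standard_normal(4)
y = rng.standard_normal(4)
alpha = 3.7

# Additivity: T(x + y) == T(x) + T(y)
print(np.allclose(A @ (x + y), A @ x + A @ y))        # True

# Homogeneity: T(alpha x) == alpha T(x)
print(np.allclose(A @ (alpha * x), alpha * (A @ x)))  # True
```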

4.3.3. Properties

Proposition 4.3 (Zero maps to zero)

If $T$ is linear, then $T(0) = 0$.

This is straightforward since

$$T(0 + 0) = T(0) + T(0) \implies T(0) = T(0) + T(0) \implies T(0) = 0.$$

Proposition 4.4

$$T \text{ is linear} \iff T(\alpha x + y) = \alpha T(x) + T(y) \quad \forall x, y \in V, \; \alpha \in \mathbb{F}.$$

Proof. Assuming $T$ to be linear, we have

$$T(\alpha x + y) = T(\alpha x) + T(y) = \alpha T(x) + T(y).$$

Now for the converse, assume

$$T(\alpha x + y) = \alpha T(x) + T(y) \quad \forall x, y \in V, \; \alpha \in \mathbb{F}.$$

Choosing both $x$ and $y$ to be $0$ and $\alpha = 1$, we get

$$T(0 + 0) = T(0) + T(0) \implies T(0) = 0.$$

Choosing $y = 0$, we get

$$T(\alpha x + 0) = \alpha T(x) + T(0) = \alpha T(x).$$

Choosing $\alpha = 1$, we get

$$T(x + y) = T(x) + T(y).$$

Thus, $T$ is a linear transformation.

Proposition 4.5

If $T$ is linear, then $T(x - y) = T(x) - T(y)$.

$$T(x - y) = T(x + (-1)y) = T(x) + T((-1)y) = T(x) + (-1)T(y) = T(x) - T(y).$$

Proposition 4.6 (Linear transformation preserves linear combinations)

$T$ is linear $\implies$ for $x_1, \dots, x_n \in V$ and $\alpha_1, \dots, \alpha_n \in \mathbb{F}$,

$$T\left(\sum_{i=1}^n \alpha_i x_i\right) = \sum_{i=1}^n \alpha_i T(x_i).$$

We can prove this by mathematical induction: the case $n = 1$ is just homogeneity, and for the inductive step we write $\sum_{i=1}^n \alpha_i x_i = \left(\sum_{i=1}^{n-1} \alpha_i x_i\right) + \alpha_n x_n$, apply additivity, and then invoke the induction hypothesis.

Some special linear transformations deserve mention.

Definition 4.48 (Identity transformation)

The identity transformation $I_V : V \to V$ is defined as

$$I_V(x) = x \quad \forall x \in V.$$

Definition 4.49

The zero transformation $0 : V \to W$ is defined as

$$0(x) = 0 \quad \forall x \in V.$$

Note that the $0$ on the R.H.S. is the zero vector of $W$.

In this definition, the symbol $0$ takes on multiple meanings: on the L.H.S. it denotes a linear transformation from $V$ to $W$ which maps every vector in $V$ to the zero vector in $W$, while on the R.H.S. it denotes the zero vector of $W$.

It should usually be obvious from the context whether we are talking about $0_{\mathbb{F}}$ or $0_V$ or $0_W$ or $0$ as a linear transformation from $V$ to $W$.

4.3.4. Null Space and Range

Definition 4.50 (Null space / Kernel)

The null space or kernel of a linear transformation $T : V \to W$, denoted by $N(T)$ or $\ker(T)$, is defined as

$$\ker(T) = N(T) \triangleq \{ x \in V \mid T(x) = 0 \}.$$

Theorem 4.23

The null space of a linear transformation $T : V \to W$ is a subspace of $V$.

Proof. Let $v_1, v_2 \in \ker(T)$ and let $\alpha \in \mathbb{F}$. Then

$$T(\alpha v_1 + v_2) = \alpha T(v_1) + T(v_2) = \alpha 0 + 0 = 0.$$

Thus $\alpha v_1 + v_2 \in \ker(T)$. Thus $\ker(T)$ is a subspace of $V$.

Definition 4.51

The range or image of a linear transformation $T : V \to W$, denoted by $R(T)$ or $\operatorname{im}(T)$, is defined as

$$R(T) = \operatorname{im}(T) \triangleq \{ T(x) \mid x \in V \}.$$

We note that $\operatorname{im}(T) \subseteq W$.

Theorem 4.24

The image of a linear transformation $T : V \to W$ is a subspace of $W$.

Proof. Let $w_1, w_2 \in \operatorname{im}(T)$ and let $\alpha \in \mathbb{F}$. Then there exist $v_1, v_2 \in V$ such that

$$w_1 = T(v_1); \quad w_2 = T(v_2).$$

Thus

$$\alpha w_1 + w_2 = \alpha T(v_1) + T(v_2) = T(\alpha v_1 + v_2).$$

Thus $\alpha w_1 + w_2 \in \operatorname{im}(T)$. Hence $\operatorname{im}(T)$ is a subspace of $W$.
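
For a matrix transformation $T(x) = Ax$, the kernel and the image can be explored numerically. A small sketch using SciPy (the matrix below is an arbitrary illustrative choice with a dependent row, so the kernel is nontrivial):

```python
import numpy as np
from scipy.linalg import null_space

# T(x) = A @ x maps R^3 -> R^3; the second row is twice the first,
# so the kernel is nontrivial and the image is a plane.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])

K = null_space(A)             # orthonormal basis of ker(T), one column per vector
print(K.shape[1])             # 1 -> the kernel is a line
print(np.allclose(A @ K, 0))  # True: every kernel vector maps to 0

print(np.linalg.matrix_rank(A))  # 2 -> the image is a 2-dimensional subspace
```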

Theorem 4.25

Let $T : V \to W$ be a linear transformation. Assume $V$ to be finite dimensional. Let $B = \{v_1, v_2, \dots, v_n\}$ be some basis of $V$. Then

$$\operatorname{im}(T) = \operatorname{span} T(B) = \operatorname{span} \{T(v_1), T(v_2), \dots, T(v_n)\};$$

i.e., the image of a basis of $V$ under a linear transformation $T$ spans the range of the transformation.

Proof. Let $w$ be some arbitrary vector in $\operatorname{im}(T)$. Then there exists $v \in V$ such that $w = T(v)$. Now

$$v = \sum_{i=1}^n c_i v_i$$

since $B$ forms a basis for $V$. Thus,

$$w = T(v) = T\left(\sum_{i=1}^n c_i v_i\right) = \sum_{i=1}^n c_i T(v_i).$$

This means that $w \in \operatorname{span} T(B)$. Conversely, each $T(v_i)$ belongs to $\operatorname{im}(T)$, and $\operatorname{im}(T)$ is a subspace (Theorem 4.24); hence $\operatorname{span} T(B) \subseteq \operatorname{im}(T)$.

Definition 4.52 (Nullity)

For vector spaces $V$ and $W$ and a linear transformation $T : V \to W$, if $\ker T$ is finite dimensional, then the nullity of $T$ is defined as

$$\operatorname{nullity} T \triangleq \dim \ker T;$$

i.e., the dimension of the null space or kernel of T.

Definition 4.53

For vector spaces $V$ and $W$ and linear $T : V \to W$, if $\operatorname{range} T$ is finite dimensional, then the rank of $T$ is defined as

$$\operatorname{rank} T \triangleq \dim \operatorname{range} T;$$

i.e., the dimension of the range or image of T.

Theorem 4.26 (Dimension theorem)

For vector spaces $V$ and $W$ and linear $T : V \to W$, if $V$ is finite dimensional, then

$$\dim V = \operatorname{nullity} T + \operatorname{rank} T.$$

This is known as the dimension theorem.
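
For matrix transformations, the dimension theorem is easy to check numerically; a random $3 \times 5$ Gaussian matrix has rank 3 with probability one, so the nullity must be 2. A minimal sketch:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(42)
A = rng.standard_normal((3, 5))  # T(x) = A @ x maps R^5 -> R^3

rank = np.linalg.matrix_rank(A)   # dim im(T)
nullity = null_space(A).shape[1]  # dim ker(T)

print(rank, nullity, rank + nullity)  # 3 2 5: rank + nullity == dim R^5
```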

Theorem 4.27

For vector spaces $V$ and $W$ and linear $T : V \to W$, $T$ is injective if and only if $\ker(T) = \{0\}$.

Proof. If $T$ is injective, then

$$v_1 \neq v_2 \implies T(v_1) \neq T(v_2).$$

Let $v \neq 0$. Since $T$ is one-one and $T(0) = 0$, we have $T(v) \neq 0$. Thus $\ker T = \{0\}$.

For the converse, let us assume that $\ker T = \{0\}$. Let $v_1, v_2 \in V$ be two vectors with the same image. Then

$$T(v_1) = T(v_2) \implies T(v_1 - v_2) = 0 \implies v_1 - v_2 \in \ker T \implies v_1 - v_2 = 0 \implies v_1 = v_2.$$

Thus $T$ is injective.

Theorem 4.28 (Bijective transformation characterization)

For vector spaces $V$ and $W$ of equal finite dimensions and linear $T : V \to W$, the following are equivalent.

  1. $T$ is injective.

  2. $T$ is surjective.

  3. $\operatorname{rank} T = \dim V$.

Proof. From (1) to (2)

Let $B = \{v_1, v_2, \dots, v_n\}$ be some basis of $V$ with $\dim V = n$.

Let us assume that the vectors in $T(B)$ are linearly dependent. Thus, there exists a linear relationship

$$\sum_{i=1}^n \alpha_i T(v_i) = 0$$

where the $\alpha_i$ are not all 0. Now

$$\sum_{i=1}^n \alpha_i T(v_i) = 0 \implies T\left(\sum_{i=1}^n \alpha_i v_i\right) = 0 \implies \sum_{i=1}^n \alpha_i v_i \in \ker T \implies \sum_{i=1}^n \alpha_i v_i = 0$$

since $T$ is injective (see Theorem 4.27). This means that the $v_i$ are linearly dependent, which contradicts our assumption that $B$ is a basis for $V$.

Thus the vectors in $T(B)$ are linearly independent.

Since $T$ is injective, all vectors in $T(B)$ are distinct; hence

$$|T(B)| = n.$$

Since the vectors in $T(B)$ span $\operatorname{im} T$ (Theorem 4.25) and are linearly independent, they form a basis of $\operatorname{im} T$.

But

$$\dim V = \dim W = n$$

and $T(B)$ is a set of $n$ linearly independent vectors in $W$.

Hence, $T(B)$ forms a basis of $W$. Thus

$$\operatorname{im} T = \operatorname{span} T(B) = W.$$

Thus $T$ is surjective.

From (2) to (3)

$T$ is surjective means $\operatorname{im} T = W$. Thus

$$\operatorname{rank} T = \dim W = \dim V.$$

From (3) to (1)

We know that

$$\dim V = \operatorname{rank} T + \operatorname{nullity} T.$$

But it is given that $\operatorname{rank} T = \dim V$. Thus

$$\operatorname{nullity} T = 0;$$

i.e., $\ker T = \{0\}$. Thus $T$ is injective (due to Theorem 4.27).

4.3.5. Bracket Operator

Recall the definition of coordinate vector from Definition 4.19. Conversion of a given vector to its coordinate vector representation can be shown to be a linear transformation.

Definition 4.54 (Bracket operator)

Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$ where $\dim V = n$. Let $B = \{v_1, \dots, v_n\}$ be an ordered basis of $V$. We define a bracket operator from $V$ to $\mathbb{F}^n$ as

$$[\cdot]_B : V \to \mathbb{F}^n, \qquad x \mapsto [x]_B \triangleq \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix}$$

where

$$x = \sum_{i=1}^n \alpha_i v_i$$

is the unique representation of $x$ in $B$.

In other words, the bracket operator takes a vector $x$ from a finite dimensional space $V$ to its representation in $\mathbb{F}^n$ for a given basis $B$.
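
Computationally, if the basis vectors of $\mathbb{R}^n$ are stacked as columns of a matrix, evaluating $[x]_B$ amounts to solving a linear system. A minimal sketch in $\mathbb{R}^2$ (the basis is an arbitrary illustrative choice):

```python
import numpy as np

# Ordered basis B = {v1, v2} of R^2, stacked as the columns of a matrix
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # v1 = (1, 0), v2 = (1, 1)

x = np.array([3.0, 2.0])

# [x]_B solves B @ coords = x, i.e., x = alpha_1 v1 + alpha_2 v2
coords = np.linalg.solve(B, x)
print(coords)                      # [1. 2.] -> x = 1*v1 + 2*v2
print(np.allclose(B @ coords, x))  # True: x is recovered from [x]_B
```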

We now show that the bracket operator is linear.

Theorem 4.29 (Bracket operator is linear and bijective)

Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$ where $\dim V = n$. Let $B = \{v_1, \dots, v_n\}$ be an ordered basis of $V$. The bracket operator $[\cdot]_B : V \to \mathbb{F}^n$ as defined in Definition 4.54 is a linear operator.

Moreover, $[\cdot]_B$ is a bijective mapping.

Proof. Let $x, y \in V$ such that

$$x = \sum_{i=1}^n \alpha_i v_i$$

and

$$y = \sum_{i=1}^n \beta_i v_i.$$

Then

$$cx + y = c\sum_{i=1}^n \alpha_i v_i + \sum_{i=1}^n \beta_i v_i = \sum_{i=1}^n (c\alpha_i + \beta_i) v_i.$$

Thus,

$$[cx + y]_B = \begin{bmatrix} c\alpha_1 + \beta_1 \\ \vdots \\ c\alpha_n + \beta_n \end{bmatrix} = c\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} + \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix} = c[x]_B + [y]_B.$$

Thus $[\cdot]_B$ is linear.

We can see that $[\cdot]_B$ is injective, since the representation of a vector in a basis is unique. Now since $\dim V = n = \dim \mathbb{F}^n$, $[\cdot]_B$ is surjective due to Theorem 4.28.

4.3.6. Matrix Representations

It is much easier to work with a matrix representation of a linear transformation. In this section we describe how matrix representations of a linear transformation are developed.

In order to develop a representation for the map $T : V \to W$, we first need to choose a representation for vectors in $V$ and $W$. This can easily be done by choosing a basis in $V$ and another in $W$. Once the bases are chosen, we can represent vectors as coordinate vectors.

Definition 4.55 (Matrix representation of a linear transformation)

Let $V$ and $W$ be finite dimensional vector spaces with ordered bases $B = \{v_1, \dots, v_n\}$ and $\Gamma = \{w_1, \dots, w_m\}$ respectively. Let $T : V \to W$ be a linear transformation. For each $v_j \in B$ we can find a unique representation for $T(v_j)$ in $\Gamma$ given by

$$T(v_j) = \sum_{i=1}^m a_{ij} w_i \quad 1 \leq j \leq n.$$

The $m \times n$ matrix $A$ defined by $A_{ij} = a_{ij}$ is the matrix representation of $T$ in the ordered bases $B$ and $\Gamma$, denoted as

$$A = [T]_B^{\Gamma}.$$

If $V = W$ and $B = \Gamma$ then we write

$$A = [T]_B.$$

The $j$-th column of $A$ is the representation of $T(v_j)$ in $\Gamma$.

In order to justify the matrix representation of $T$, we need to show that applying $T$ is the same as multiplying by $A$. This is stated formally below.

Theorem 4.30 (Justification of matrix representation)

$$[T(v)]_\Gamma = [T]_B^{\Gamma} [v]_B \quad \forall v \in V.$$

Proof. Let

$$v = \sum_{j=1}^n c_j v_j.$$

Then

$$[v]_B = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.$$

Now

$$T(v) = T\left(\sum_{j=1}^n c_j v_j\right) = \sum_{j=1}^n c_j T(v_j) = \sum_{j=1}^n c_j \sum_{i=1}^m a_{ij} w_i = \sum_{i=1}^m \left(\sum_{j=1}^n a_{ij} c_j\right) w_i.$$

Thus

$$[T(v)]_\Gamma = \begin{bmatrix} \sum_{j=1}^n a_{1j} c_j \\ \vdots \\ \sum_{j=1}^n a_{mj} c_j \end{bmatrix} = A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = [T]_B^{\Gamma} [v]_B.$$
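
As a concrete illustration (our own example, not from the original text), take $T = d/dx$ from the space of polynomials of degree at most 2 to polynomials of degree at most 1, with monomial bases $B = \{1, x, x^2\}$ and $\Gamma = \{1, x\}$. Expressing each $T(v_j)$ in $\Gamma$ gives the columns of $[T]_B^{\Gamma}$, and multiplication by this matrix differentiates coordinate vectors:

```python
import numpy as np

# T = d/dx : P_2 -> P_1 with B = {1, x, x^2} and Gamma = {1, x}.
# T(1) = 0, T(x) = 1, T(x^2) = 2x, so the columns of A are
# [0, 0], [1, 0] and [0, 2] respectively.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

# p(x) = 3 + 5x - 2x^2 has coordinate vector [3, 5, -2] in B.
p_B = np.array([3.0, 5.0, -2.0])

# [T(p)]_Gamma = A @ [p]_B should represent p'(x) = 5 - 4x.
print(A @ p_B)  # [ 5. -4.]
```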

4.3.7. Vector Space of Linear Transformations

If we consider the set of linear transformations from $V$ to $W$, we can impose some structure on it and take advantage of that structure.

First of all, we will define basic operations like addition and scalar multiplication on the general set of mappings from a vector space $V$ to another vector space $W$.

Definition 4.56 (Addition and scalar multiplication on mappings)

Let $T$ and $U$ be arbitrary mappings from a vector space $V$ to a vector space $W$ over the field $\mathbb{F}$. Then addition of mappings is defined as

$$(T + U)(v) = T(v) + U(v) \quad \forall v \in V.$$

Scalar multiplication on a mapping is defined as

$$(\alpha T)(v) = \alpha(T(v)) \quad \forall \alpha \in \mathbb{F}, \; v \in V.$$

With these definitions we have

$$(\alpha T + U)(v) = (\alpha T)(v) + U(v) = \alpha(T(v)) + U(v).$$

We are now ready to show that with the addition and scalar multiplication as defined above, the set of linear transformations from V to W actually forms a vector space.

Theorem 4.31 (Linear transformations form a vector space)

Let $V$ and $W$ be vector spaces over field $\mathbb{F}$. Let $T$ and $U$ be some linear transformations from $V$ to $W$. Let addition and scalar multiplication of linear transformations be defined as in Definition 4.56. Then $\alpha T + U$, where $\alpha \in \mathbb{F}$, is a linear transformation.

Moreover, the set of linear transformations from V to W forms a vector space.

Proof. We first show that $\alpha T + U$ is linear.

Let $x, y \in V$ and $\beta \in \mathbb{F}$. Then we need to show that

$$(\alpha T + U)(x + y) = (\alpha T + U)(x) + (\alpha T + U)(y)$$

and

$$(\alpha T + U)(\beta x) = \beta((\alpha T + U)(x)).$$

Starting with the first one:

$$\begin{aligned}
(\alpha T + U)(x + y) &= (\alpha T)(x + y) + U(x + y) \\
&= \alpha(T(x + y)) + U(x) + U(y) \\
&= \alpha T(x) + \alpha T(y) + U(x) + U(y) \\
&= (\alpha T)(x) + U(x) + (\alpha T)(y) + U(y) \\
&= (\alpha T + U)(x) + (\alpha T + U)(y).
\end{aligned}$$

Now for the second one:

$$\begin{aligned}
(\alpha T + U)(\beta x) &= (\alpha T)(\beta x) + U(\beta x) \\
&= \alpha(T(\beta x)) + \beta(U(x)) \\
&= \alpha(\beta(T(x))) + \beta(U(x)) \\
&= \beta(\alpha(T(x))) + \beta(U(x)) \\
&= \beta((\alpha T)(x) + U(x)) \\
&= \beta((\alpha T + U)(x)).
\end{aligned}$$

We can now easily verify that the set of linear transformations from V to W satisfies all the requirements of a vector space. Hence it is a vector space (of linear transformations from V to W).

Definition 4.57 (The vector space of linear transformations)

Let $V$ and $W$ be vector spaces over field $\mathbb{F}$. Then the vector space of linear transformations from $V$ to $W$ is denoted by $\mathcal{L}(V, W)$.

When $V = W$, it is simply denoted by $\mathcal{L}(V)$.

The addition and scalar multiplication as defined in Definition 4.56 carry forward to matrix representations of linear transformations as well.

Theorem 4.32

Let $V$ and $W$ be finite dimensional vector spaces over field $\mathbb{F}$ with $B$ and $\Gamma$ being their respective bases. Let $T$ and $U$ be some linear transformations from $V$ to $W$.

Then, the following hold:

  1. $[T + U]_B^{\Gamma} = [T]_B^{\Gamma} + [U]_B^{\Gamma}$.

  2. $[\alpha T]_B^{\Gamma} = \alpha [T]_B^{\Gamma} \quad \forall \alpha \in \mathbb{F}$.
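
Continuing the differentiation example above (again our own illustration): let $T = d/dx$ and let $U$ be the forward-difference map $U(p)(x) = p(x+1) - p(x)$, both from $P_2$ to $P_1$ with the same bases $B = \{1, x, x^2\}$ and $\Gamma = \{1, x\}$. Building the matrix of $T + U$ directly from its action on the basis agrees with adding the two matrices:

```python
import numpy as np

# T = d/dx:                T(1) = 0, T(x) = 1, T(x^2) = 2x
T_mat = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 2.0]])
# U(p)(x) = p(x+1) - p(x): U(1) = 0, U(x) = 1, U(x^2) = 2x + 1
U_mat = np.array([[0.0, 1.0, 1.0],
                  [0.0, 0.0, 2.0]])

# Matrix of T + U built column by column from (T+U)(v_j):
# (T+U)(1) = 0, (T+U)(x) = 2, (T+U)(x^2) = 1 + 4x
TU_mat = np.array([[0.0, 2.0, 1.0],
                   [0.0, 0.0, 4.0]])

print(np.allclose(TU_mat, T_mat + U_mat))  # True: [T+U] = [T] + [U]
```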

4.3.8. Projections

Definition 4.58

A projection is a linear transformation $P$ from a vector space $V$ to itself such that $P^2 = P$; i.e., if $Pv = x$, then $Px = x$.

Remark 4.4

Whenever $P$ is applied twice (or more) to any vector, it gives the same result as if it were applied once.

Thus, $P$ is an idempotent operator.

Example 4.14 (Projection operators)

Consider the operator $P : \mathbb{R}^3 \to \mathbb{R}^3$ defined as

$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$

Then the application of $P$ on an arbitrary vector is given by

$$P \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}.$$

A second application doesn't change it:

$$P \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}.$$

Thus $P$ is a projection operator.

Often, we can directly verify the property by computing $P^2$ as

$$P^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} = P.$$
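
More generally, for a real matrix $A$ with full column rank, $P = A(A^\top A)^{-1}A^\top$ is the orthogonal projection onto the column space of $A$; this standard construction (added here for illustration, with an arbitrary choice of $A$) gives projections that are easy to verify numerically:

```python
import numpy as np

# Orthogonal projection onto the column space of A (A has full column rank)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(P @ P, P))  # True: P is idempotent, hence a projection

v = np.array([1.0, 2.0, 3.0])
print(np.allclose(P @ (P @ v), P @ v))  # True: applying twice == once
```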