4.14. Affine Sets and Transformations

Primary references for this section are [9, 17, 67].

In this section \(\VV\) denotes a vector space over some field \(\FF\) which can be either \(\RR\) (real numbers) or \(\CC\) (complex numbers). Much of the section will not require any other structure on the vector space.

Some results in this section are applicable for normed linear spaces or inner product spaces. We shall assume that \(\VV\) is endowed with an appropriate norm \(\| \cdot \| : \VV \to \RR\) or an inner product \(\langle \cdot, \cdot \rangle : \VV \times \VV \to \FF\) wherever applicable.

Note

The notion of lines in a complex vector space may sound very confusing as a complex line is topologically equivalent to a real plane, not a real line. If you are getting lost while reading this section, just think of \(\FF\) as \(\RR\) and visualize everything in a real vector space. The algebraic presentation of affine sets and spaces is equally valid for complex vector spaces.

A key property of \(\RR\) is that \(\RR\) is totally ordered. Hence, the scalars from \(\RR\) can be compared. There is no natural order in \(\CC\), the field of complex numbers. As you study this section, you will notice that scalar comparison is never needed in the treatment of affine sets, subspaces and transformations.

4.14.1. Lines

Definition 4.158 (Line)

Let \(\bx_1\) and \(\bx_2\) be two points in \(\VV\). Points of the form

\[ \by = \theta \bx_1 + (1 - \theta) \bx_2 \text{ where } \theta \in \FF \]

form a line passing through \(\bx_1\) and \(\bx_2\).

  • at \(\theta=0\) we have \(\by=\bx_2\).

  • at \(\theta=1\) we have \(\by=\bx_1\).

We can also rewrite \(\by\) as

\[ \by = \bx_2 + \theta (\bx_1 - \bx_2) \Forall \theta \in \FF. \]

In this definition:

  • \(\bx_2\) is called the base point for this line.

  • \(\bx_1 - \bx_2\) defines the direction of the line.

  • \(\by\) is the sum of the base point and the direction scaled by the parameter \(\theta\).

  • As \(\theta\) goes from \(0\) to \(1\), \(\by\) moves from \(\bx_2\) to \(\bx_1\).

Remark 4.29

An alternative notation for the line as a set is \(\bx_2 + \FF (\bx_1 - \bx_2)\) following the notation in Definition 4.25.
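The two parametrizations of a line in Definition 4.158 can be checked numerically. The following is a minimal sketch, assuming points of \(\VV = \RR^2\) with rational coordinates stored as tuples of `Fraction`; the helper name `line_point` and the sample points are illustrative choices, not part of the text.

```python
from fractions import Fraction

def line_point(x1, x2, theta):
    """Return theta*x1 + (1 - theta)*x2 for points given as tuples."""
    return tuple(theta * a + (1 - theta) * b for a, b in zip(x1, x2))

# Two sample points (hypothetical values chosen for illustration).
x1 = (Fraction(1), Fraction(2))
x2 = (Fraction(3), Fraction(0))

at_zero = line_point(x1, x2, Fraction(0))      # theta = 0 gives x2
at_one = line_point(x1, x2, Fraction(1))       # theta = 1 gives x1
midpoint = line_point(x1, x2, Fraction(1, 2))  # halfway between x2 and x1

# Equivalent form: base point x2 plus theta times the direction x1 - x2.
theta = Fraction(-3, 2)
direction = tuple(a - b for a, b in zip(x1, x2))
via_base = tuple(b + theta * d for b, d in zip(x2, direction))
```

Values of \(\theta\) outside \([0, 1]\), such as \(-\frac{3}{2}\) here, give points on the line beyond \(\bx_2\) or \(\bx_1\); both forms produce the same point.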

4.14.2. Affine Sets

Definition 4.159 (Affine set)

A set \(C \subseteq \VV\) is affine if the line through any two distinct points in \(C\) lies in \(C\).

In other words, for any \(\bx_1, \bx_2 \in C\), we have \(\theta \bx_1 + (1 - \theta) \bx_2 \in C\) for all \(\theta \in \FF\).

Another way to write this is:

\[ \Forall \theta \in \FF, \quad C = \theta C + (1 - \theta) C. \]

Different authors use other names for affine sets like “affine manifold”, “affine variety”, “linear variety” or “flat”.

Example 4.29

The empty set \(\EmptySet\) is affine vacuously as it contains no points. Hence, every line passing through the points in \(\EmptySet\) is inside it vacuously.

Example 4.30

For any \(\bx \in \VV\), the singleton set \(\{ \bx \}\) is affine vacuously. It contains only one point. Hence, every line passing through two distinct points in \(\{ \bx \}\) is inside it vacuously.

In fact:

\[ \theta \bx + (1 - \theta) \bx = \bx \Forall \theta \in \FF. \]

Example 4.31

Any line in \(\VV\) is an affine set.

Example 4.32

Any vector space \(\VV\) is affine. It is so since a vector space is closed under vector addition and scalar multiplication. Hence, for any two points in the vector space, the line passing through them is contained inside the space.

Theorem 4.164 (Linear subspaces are affine)

The linear subspaces of a vector space \(\VV\) are affine sets containing the zero vector.

Proof. Let \(\WW\) be a linear subspace of \(\VV\).

  1. Then \(\WW\) contains \(\bzero\).

  2. Let \(\bx, \by \in \WW\).

  3. Then, by linearity, any \(\alpha \bx + \beta \by \in \WW\).

  4. In particular, for every \(\theta \in \FF\), \(\theta \bx + (1 - \theta)\by \in \WW\) holds too.

  5. Thus, \(\WW\) is affine.

For the converse, let \(A\) be an affine set containing \(\bzero\).

  1. For any \(\bx \in A\) and \(t \in \FF\),

    \[ t \bx = (1 - t) \bzero + t \bx \in A \]

    since \(A\) is affine. Thus, \(A\) is closed under scalar multiplication.

  2. Let \(\bx, \by \in A\). Since \(A\) is affine, hence

    \[ \frac{1}{2} (\bx + \by) = \frac{1}{2} \bx + \left (1 - \frac{1}{2} \right) \by \in A. \]
  3. But then, \(\bx + \by = 2 \left ( \frac{1}{2} (\bx + \by) \right ) \in A\) holds too since \(A\) is closed under scalar multiplication.

  4. Thus, \(A\) is closed under vector addition.

  5. Since \(A\) is closed under scalar multiplication and vector addition, hence \(A\) must be a subspace.

4.14.3. Affine Combinations

If we denote \(\alpha = \theta\) and \(\beta = (1 - \theta)\) we see that \(\alpha \bx_1 + \beta \bx_2\) represents a linear combination of points in \(C\) such that \(\alpha + \beta = 1\). The idea can be generalized in the following way.

Definition 4.160 (Affine combination)

A point of the form \(\bx = \theta_1 \bx_1 + \dots + \theta_k \bx_k\) where \(\theta_1 + \dots + \theta_k = 1\) with \(\theta_i \in \FF\) and \(\bx_i \in \VV\), is called an affine combination of the points \(\bx_1,\dots,\bx_k\).

Note that the definition only considers a finite number of terms in the affine combination.

It can be shown easily that an affine set \(C\) contains all affine combinations of its points.

Theorem 4.165 (Affine set contains affine combinations)

If \(C\) is an affine set, \(\bx_1, \dots, \bx_k \in C\), and \(\theta_1 + \dots + \theta_k = 1\), then the point \(\bx = \theta_1 \bx_1 + \dots + \theta_k \bx_k\) also belongs to \(C\).

Proof. We shall call \(\theta_1 \bx_1 + \dots + \theta_k \bx_k = \sum_{i=1}^k \theta_i \bx_i\) with \(\sum_{i=1}^k \theta_i = 1\) a \(k\) term affine combination.

Our proof strategy is as follows:

  1. We show that an affine set contains all its 2 term affine combinations.

  2. We then show that if an affine set contains all its \(k-1\) term affine combinations then it must contain all its \(k\) term affine combinations.

  3. Thus, by the principle of mathematical induction, it contains all its affine combinations.

An affine combination of two points is of the form \(\theta_1 \bx_1 + \theta_2 \bx_2\) where \(\theta_1 + \theta_2 = 1\). By definition an affine set contains all its 2 term affine combinations.

Now, assume that \(C\) contains all its \(k-1\) term affine combinations.

  1. Consider points \(\bx_1, \dots, \bx_{k-1}, \bx_k \in C\).

  2. Let \(\theta_1, \dots, \theta_{k-1}, \theta_k \in \FF\) such that \(\theta_1 + \dots + \theta_{k-1} + \theta_k = 1\).

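  3. Since \(k \geq 2\), not all \(\theta_i\) can be equal to \(1\) (otherwise their sum would be \(k \neq 1\)). Relabeling the points if necessary, assume without loss of generality that \(\theta_k \neq 1\). Thus, \(1 - \theta_k \neq 0\).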

  4. Note that \(\theta_1 + \dots + \theta_{k-1} = 1 - \theta_k\).

  5. Thus, \(\frac{\theta_1}{1 - \theta_k} + \dots + \frac{\theta_{k-1}}{1- \theta_k} = 1\).

  6. We can then write:

    \[\begin{split} \begin{aligned} \bx &= \sum_{i=1}^k \theta_i \bx_i = \sum_{i=1}^{k-1} \theta_i \bx_i + \theta_k \bx_k \\ &= (1 - \theta_k) \sum_{i=1}^{k-1} \frac{\theta_i}{1 - \theta_k} \bx_i + \theta_k \bx_k. \end{aligned} \end{split}\]
  7. Note that the term \(\by = \sum_{i=1}^{k-1} \frac{\theta_i}{1 - \theta_k} \bx_i\) is an affine combination of \(k-1\) terms.

  8. Thus, by inductive hypothesis, \(\by \in C\).

  9. We are left with

    \[ \bx = (1 - \theta_k) \by + \theta_k \bx_k. \]
  10. This is a two term affine combination. Since \(\by, \bx_k \in C\), hence \(\bx \in C\).

  11. Thus, we established that if \(C\) contains its \(k-1\) term affine combinations, it contains its \(k\) term affine combinations too.
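The inductive step of this proof is constructive: a \(k\) term affine combination can be evaluated using only two term affine combinations. The following is a rough sketch of that reduction, assuming rational weights given as `Fraction` values and points given as tuples; the function names `two_term` and `affine_combination` are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

def two_term(t, x, y):
    """Two term affine combination (1 - t)*x + t*y."""
    return tuple((1 - t) * a + t * b for a, b in zip(x, y))

def affine_combination(thetas, points):
    """Evaluate sum(theta_i * x_i) using only two term affine combinations."""
    assert sum(thetas) == 1, "weights must sum to 1"
    if len(points) == 1:
        return tuple(points[0])
    # Pick a weight different from 1 to play the role of theta_k in the proof.
    k = next(i for i, t in enumerate(thetas) if t != 1)
    theta_k, x_k = thetas[k], points[k]
    # Rescaled weights theta_i / (1 - theta_k) of the remaining points sum to 1.
    rest_thetas = [t / (1 - theta_k) for i, t in enumerate(thetas) if i != k]
    rest_points = [p for i, p in enumerate(points) if i != k]
    y = affine_combination(rest_thetas, rest_points)  # (k-1) term combination
    return two_term(theta_k, y, x_k)                  # (1 - theta_k)*y + theta_k*x_k

# Sample data (hypothetical): weights 2, -3, 2 sum to 1.
thetas = [Fraction(2), Fraction(-3), Fraction(2)]
points = [(1, 0), (0, 1), (1, 1)]
result = affine_combination(thetas, points)

# Direct evaluation of the same sum for comparison.
direct = tuple(sum(t * p[j] for t, p in zip(thetas, points)) for j in range(2))
```

Each recursive call corresponds to one application of the inductive hypothesis, so a set closed under two term affine combinations necessarily contains `result`.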

Theorem 4.166

An affine combination of affine combinations is an affine combination.

Proof. Let \(\bu = \sum_{i=1}^k \theta_i \bx_i\) and \(\bv = \sum_{j=1}^l \lambda_j \by_j\) where \(\bx_i , \by_j \in \VV\) and \(\sum \theta_i = 1\) and \(\sum \lambda_j = 1\).

We claim that \(\bw = \gamma \bu + ( 1 - \gamma) \bv\) is also an affine combination.

\[\begin{split} \begin{aligned} \bw &= \gamma \bu + ( 1 - \gamma) \bv \\ &= \gamma \sum_{i=1}^k \theta_i \bx_i + ( 1 - \gamma) \sum_{j=1}^l \lambda_j \by_j\\ &= \sum_{i=1}^k \gamma \theta_i \bx_i + \sum_{j=1}^l ( 1 - \gamma) \lambda_j \by_j. \end{aligned} \end{split}\]

Notice that:

\[\begin{split} \begin{aligned} & \sum_{i=1}^k \gamma \theta_i + \sum_{j=1}^l ( 1 - \gamma) \lambda_j\\ &= \gamma \sum_{i=1}^k \theta_i + ( 1 - \gamma) \sum_{j=1}^l \lambda_j\\ &= \gamma 1 + (1 - \gamma)1 = 1. \end{aligned} \end{split}\]

Thus, \(\bw\) is an affine combination of the points \(\bx_i\) and \(\by_j\).

We can use mathematical induction to show that arbitrary affine combinations of affine combinations are affine combinations.
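The bookkeeping in this proof is easy to check on concrete weights. The sketch below, which assumes rational weights stored as `Fraction` values and uses a hypothetical helper `combine`, builds the coefficients of \(\bw\) with respect to the points \(\bx_i\) and \(\by_j\).

```python
from fractions import Fraction

def combine(gamma, thetas, lambdas):
    """Weights of gamma*u + (1 - gamma)*v in terms of the x_i and the y_j."""
    return [gamma * t for t in thetas] + [(1 - gamma) * l for l in lambdas]

# Sample weights (hypothetical); each list sums to 1.
thetas = [Fraction(1, 2), Fraction(3, 2), Fraction(-1)]
lambdas = [Fraction(5), Fraction(-4)]

weights = combine(Fraction(-2, 3), thetas, lambdas)
total = sum(weights)  # gamma * 1 + (1 - gamma) * 1
```

The combined weights again sum to one for every choice of \(\gamma\), which is exactly the computation carried out in the proof.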

4.14.4. Connection with Linear Subspaces

Theorem 4.167 (affine - point = linear)

Let \(C\) be a nonempty affine set and \(\bx_0\) be any element in \(C\). Then the set

\[ V = C - \bx_0 = \{ \bx - \bx_0 | \bx \in C\} \]

is a linear subspace of \(\VV\).

To show that \(V\) is indeed a linear subspace, we can show that every linear combination of two arbitrary elements in \(V\) belongs to \(V\).

Proof. Let \(\bv_1\) and \(\bv_2\) be two elements in \(V\). Then by definition, there exist \(\bx_1\) and \(\bx_2\) in \(C\) such that

\[ \bv_1 = \bx_1 - \bx_0 \text{ and } \bv_2 = \bx_2 - \bx_0. \]

Thus

\[ a \bv_1 + \bv_2 = a (\bx_1 - \bx_0) + \bx_2 - \bx_0 = (a \bx_1 + \bx_2 - a \bx_0 ) - \bx_0 \Forall a \in \FF. \]

But since \(a + 1 - a = 1\), hence \(\bx_3 = (a \bx_1 + \bx_2 - a \bx_0 ) \in C\) (an affine combination).

Hence \(a \bv_1 + \bv_2 = \bx_3 - \bx_0 \in V\) [by definition of \(V\)].

Moreover, \(\bzero = \bx_0 - \bx_0 \in V\), so \(V\) is nonempty. Thus, any linear combination of elements in \(V\) belongs to \(V\). Hence, \(V\) is a linear subspace of \(\VV\).

Observation 4.9 (affine = linear + point)

With the previous result, we can use the following notation:

\[ C = V + \bx_0 = \{ \bv + \bx_0 | \bv \in V\} \]

where \(V\) is a linear subspace of \(\VV\) and \(\bx_0 \in \VV\). In other words, a nonempty affine set is a linear subspace with an offset.

We need to justify this notation by establishing that there is one and only one linear subspace associated with an affine set. This is done in the next result.

Theorem 4.168 (Uniqueness of associated subspace)

Let \(C\) be a nonempty affine set and let \(\bx_1\) and \(\bx_2\) be two distinct elements in \(C\). Let \(V_1 = C - \bx_1\) and \(V_2 = C - \bx_2\), then the linear subspaces \(V_1\) and \(V_2\) are identical.

Proof. We show that \(V_1 \subseteq V_2\) and \(V_2 \subseteq V_1\).

  1. Let \(\bv \in V_1\).

  2. There exists \(\bx \in C\) such that \(\bv = \bx - \bx_1\).

  3. Then, \(\bv = \bx - \bx_1 + \bx_2 - \bx_2\).

  4. Let \(\by = \bx - \bx_1 + \bx_2\). Note that \(\bx, \bx_1, \bx_2 \in C\) and \(\by\) is an affine combination of \(\bx, \bx_1, \bx_2\) since the coefficients \(1, -1, 1\) sum to \(1\).

  5. Thus, \(\by \in C\).

  6. We can now write \(\bv = \by - \bx_2\).

  7. Thus, \(\bv \in V_2\) as \(V_2 = C - \bx_2\).

  8. Thus, \(V_1 \subseteq V_2\).

  9. An identical reasoning starting with some \(\bv \in V_2\) gives us \(V_2 \subseteq V_1\).

  10. Thus, \(V_1 = V_2\).

Thus the subspace \(V\) associated with a nonempty affine set \(C\) doesn’t depend upon the choice of offset \(\bx_0\) in \(C\).
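As a small sanity check of Theorems 4.167 and 4.168, consider the line \(C = \{ (x, y) \ST x + 2y = 3 \}\) in \(\RR^2\). The sketch below assumes this particular \(C\), a handful of sample points with rational coordinates, and the candidate subspace \(V = \{ (x, y) \ST x + 2y = 0 \}\); the helper names `in_C`, `in_V` and `sub` are illustrative.

```python
from fractions import Fraction

def in_C(p):
    """Membership in the affine set C = {(x, y) : x + 2y = 3}."""
    return p[0] + 2 * p[1] == 3

def in_V(v):
    """Membership in the candidate subspace V = {(x, y) : x + 2y = 0}."""
    return v[0] + 2 * v[1] == 0

def sub(p, q):
    """Difference p - q of two points in the plane."""
    return (p[0] - q[0], p[1] - q[1])

# Sample points of C, parametrized as (3 - 2s, s).
params = (Fraction(0), Fraction(1), Fraction(-5, 2), Fraction(7, 3))
points = [(3 - 2 * s, s) for s in params]

# Every difference x - x0 of points of C lands in V, whichever base point x0 is used.
differences_in_V = all(in_V(sub(x, x0)) for x in points for x0 in points)

# Conversely, shifting a vector of V by any base point of C gives a point of C.
v = (Fraction(-4), Fraction(2))
shifted_in_C = all(in_C((v[0] + x0[0], v[1] + x0[1])) for x0 in points)
```

A finite sample cannot prove the theorems, but it shows the pattern: the same subspace \(V\) appears no matter which base point of \(C\) is subtracted.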

Corollary 4.30

If an affine set contains \(\bzero\) then it is a linear subspace.

We have already shown this in Theorem 4.164. This is an alternative proof.

Proof. The linear subspace associated with an affine set \(C\) is given by \(V = C - \bx_0\) for any \(\bx_0 \in C\).

In particular, if \(C\) contains \(\bzero\), then

\[ V = C - \bzero = C. \]

Thus, \(C\) is a linear subspace.

4.14.5. Affine Subspaces and Dimension

Definition 4.161 (Affine subspace)

A nonempty affine set is called an affine subspace. An affine subspace is a linear subspace with an offset.

Another way to express this is as follows. \(C\) is an affine subspace of \(\VV\) if:

\[ C \neq \EmptySet \text{ and } \Forall \theta \in \FF, \quad C = \theta C + (1 - \theta) C. \]

Definition 4.162 (Affine proper subspace)

An affine subspace \(A\) in a vector space \(\VV\) is called a proper subspace if the linear subspace associated with \(A\) is a proper subspace of \(\VV\).

In other words, \(A\) is affine, \(A \neq \EmptySet\), and \(A \neq \VV\).

Definition 4.163 (Affine dimension)

We define the affine dimension of an affine subspace \(C\) as the dimension of the associated linear subspace \(V = C - \bx_0\) for some \(\bx_0 \in C\) if the subspace \(V\) is finite dimensional.

The dimension of \(\EmptySet\) (empty affine set) is \(-1\) by convention.

The definition is consistent since \(V\) is independent of the choice of \(\bx_0 \in C\).

Example 4.33 (Singletons as affine subspaces)

For any \(\bx \in \VV\), the singleton set \(\{ \bx\}\) can be expressed as

\[ \{ \bx\} = \bx + \{ \bzero \}. \]

Its corresponding linear subspace is \(\{ \bzero \}\) of zero dimension.

Thus, the singleton set has an affine dimension of 0.

Remark 4.30 (Points, lines, planes, flats)

The affine sets of dimension 0, 1 and 2 are called points, lines and planes respectively.

An affine set of dimension \(k\) is often called a \(k\)-flat.

Example 4.34 (More affine sets)

  • The Euclidean space \(\RR^n\) is affine.

  • Any line is affine. The associated linear subspace is the line parallel to it which passes through the origin.

  • Any plane is affine. If it passes through the origin, it is a linear subspace. The associated linear subspace is the plane parallel to it which passes through the origin.

Theorem 4.169

An affine subspace is closed under affine combinations.

Proof. This is from the definition of affine sets and Theorem 4.165.

Observation 4.10 (Affine - affine = Linear)

Let \(C\) be an affine subspace. Let \(V\) be the linear subspace associated with \(C\) given by \(V = C - \bx\). Then every vector \(\bv \in V\) can be written as \(\bv = \by - \bx\) where \(\by \in C\). Since \(V\) doesn’t depend on the choice of \(\bx\), hence \(V\) is the set of all vectors of the form \(\by - \bx\) where \(\by, \bx \in C\).

Thus, following the notation in Definition 4.25, we can write \(V\) as:

\[ V = C - C. \]

One way to think of affine sets is as collections of points in an arbitrary space, with the associated linear subspace being the collection of difference vectors between points.

\[ \text{vector ab} = \text{point b} - \text{point a}. \]

4.14.6. Affine Hull

Definition 4.164 (Affine hull)

The set of all affine combinations of points in some arbitrary nonempty set \(S \subseteq \VV\) is called the affine hull of \(S\) and denoted as \(\affine S\):

\[ \affine S = \{\theta_1 \bx_1 + \dots + \theta_k \bx_k \ST \bx_1, \dots, \bx_k \in S \text{ and } \theta_1 + \dots + \theta_k = 1\}. \]

Theorem 4.170

An affine hull is an affine subspace.

Proof. Let \(S \subset \VV\) be nonempty. Let \(T = \affine S\). Let \(\bu, \bv \in T\). Then

\[ \bu = \sum_{i=1}^{k} \theta_i \bx_i \text{ and } \bv = \sum_{j=1}^{l} \lambda_j \by_j \]

where \(\bx_i, \by_j \in S\), \(\sum \theta_i = 1\) and \(\sum \lambda_j = 1\).

Then, as shown in Theorem 4.166, for any \(\gamma \in \FF\),

\[ \bw = \gamma \bu + (1 - \gamma) \bv \]

is an affine combination of points \(\bx_i, \by_j \in S\).

Thus, \(\bw \in T\). Hence, \(T\) is an affine set. Since \(T\) is nonempty, hence \(T\) is an affine subspace.

Theorem 4.171 (Smallest containing affine subspace)

The affine hull of a nonempty set \(S\) is the smallest affine subspace containing \(S\). More specifically, let \(C\) be any affine subspace with \(S \subseteq C\). Then \(\affine S \subseteq C\).

Proof. Let \(C\) be an arbitrary affine subspace such that \(S \subseteq C\).

  1. From Theorem 4.169, \(C\) is closed under affine combinations.

  2. Thus, \(C\) contains all affine combinations of points of \(S\).

  3. Thus, \(\affine S \subseteq C\).

  4. We established in Theorem 4.170 that \(\affine S\) is an affine subspace.

  5. Thus, it is the smallest affine subspace containing \(S\).

Corollary 4.31 (Affine hull as intersection)

The affine hull of a set is the intersection of all affine subspaces containing it.

Theorem 4.172 (Affine hull of a finite set)

Let \(S = \{ \bv_0, \bv_1, \dots, \bv_k \}\) be a finite set of vectors from a vector space \(\VV\). Let \(A = \affine S\) be their affine hull. Then, the linear subspace associated with \(A\) is given by

\[ L = \span \{\bv_1 - \bv_0, \dots, \bv_k - \bv_0\}. \]

Consequently, the dimension of \(\affine S\) is at most \(k\).

Proof. Since \(L = A - \bv_0\), hence \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0 \in L\). Thus, \(\span \{ \bv_1 - \bv_0, \dots, \bv_k - \bv_0\} \subseteq L\).

Now, let \(\bv \in L\). Then, there exist \(t_0, \dots, t_k\) with \(t_0 + \dots + t_k =1\) such that

\[ \bv = t_0 \bv_0 + \dots + t_k \bv_k - \bv_0. \]

But then

\[\begin{split} \bv &= (1 - t_1 - \dots - t_k) \bv_0 + t_1 \bv_1 + \dots + t_k \bv_k - \bv_0\\ &= t_1 (\bv_1 - \bv_0) + \dots + t_k (\bv_k - \bv_0). \end{split}\]

Thus, \(\bv \in \span \{\bv_1 - \bv_0, \dots, \bv_k - \bv_0\}\). Thus, \(L \subseteq \span \{\bv_1 - \bv_0, \dots, \bv_k - \bv_0\}\).

Combining:

\[ L = \span \{\bv_1 - \bv_0, \dots, \bv_k - \bv_0\}. \]

Since \(L\) is a span of \(k\) vectors, hence \(\dim L \leq k\). Thus, \(\dim A \leq k\).

Theorem 4.173 (Containment)

If \(A \subseteq B\), then \(\affine A \subseteq \affine B\).

Proof. We proceed as follows:

  1. By definition, \(\affine B\) contains all affine combinations of points in \(B\).

  2. Thus, it contains all affine combinations of points in \(A\) since \(A \subseteq B\).

  3. But that is \(\affine A\).

  4. Thus, \(\affine A \subseteq \affine B\).

Theorem 4.174 (Tight containment)

If \(A \subseteq B \subseteq \affine A\), then \(\affine A = \affine B\).

Proof. We proceed as follows:

  1. Note that \(\affine A\) is an affine set containing \(B\).

  2. But \(\affine B\) is the smallest affine set containing \(B\), hence \(\affine B \subseteq \affine A\).

  3. But \(A \subseteq B\) implies that \(\affine A \subseteq \affine B\).

  4. Thus, \(\affine A = \affine B\).

4.14.7. Affine Independence

Definition 4.165 (Affine independence)

A set of vectors \(\bv_0, \bv_1, \dots, \bv_k \in \VV\) is called affine independent, if the vectors \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly independent.

If the linear subspace associated with an affine set has dimension \(l\), then at most \(l\) vectors in it can be linearly independent. Hence, at most \(l+1\) points of the affine set can be affine independent.

Definition 4.166 (Affine dependence)

A set of vectors \(\bv_0, \bv_1, \dots, \bv_k \in \VV\) is called affine dependent, if it is not affine independent. In other words, the vectors \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly dependent.

Theorem 4.175 (Basis for the linear subspace associated with affine independent set)

Let \(\bv_0, \bv_1, \dots, \bv_k\) be a set of affine independent points in \(\VV\).

Let \(S = \{ \bv_0, \bv_1, \dots, \bv_k \}\). Let \(A = \affine S\). Let \(L\) be the linear subspace associated with \(A\).

Then, \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) form a basis for \(L\).

Proof. By definition of affine independence, \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly independent.

By Theorem 4.172

\[ L = \span \{ \bv_1 - \bv_0, \dots, \bv_k - \bv_0\}. \]

Since they are linearly independent and span \(L\), hence they form a basis for \(L\).

Theorem 4.176 (Affine independence and dimension)

A set of vectors \(\bv_0, \bv_1, \dots, \bv_k \in \VV\) is affine independent if and only if their affine hull \(\affine \{\bv_0, \bv_1, \dots, \bv_k\}\) is \(k\) dimensional.

Proof. Assume \(\bv_0, \bv_1, \dots, \bv_k\) to be affine independent.

  1. Then, by Definition 4.165, \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly independent.

  2. Let \(L = \span \{ \bv_1 - \bv_0, \dots, \bv_k - \bv_0 \}\).

  3. By Theorem 4.172

    \[ \dim \affine \{\bv_0, \bv_1, \dots, \bv_k\} = \dim L = k \]

    since \(L\) is a span of \(k\) linearly independent vectors.

Now, assume \(A = \affine \{\bv_0, \bv_1, \dots, \bv_k\}\) is \(k\) dimensional.

  1. By Theorem 4.172, the linear subspace associated with \(A\) is given by \(L = \span \{ \bv_1 - \bv_0, \dots, \bv_k - \bv_0 \}\).

  2. Thus, \(L\) is \(k\) dimensional since \(\dim L = \dim A = k\).

  3. But, \(L\) is a span of \(k\) vectors.

  4. Hence, the \(k\) vectors \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) must be linearly independent.

[67] defines \(\bv_0, \bv_1, \dots, \bv_k \in \VV\) as affine independent if their hull is \(k\) dimensional. As we can see above, our definition is equivalent.
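Theorems 4.172 and 4.176 suggest a simple computational test: the dimension of \(\affine \{\bv_0, \dots, \bv_k\}\) is the rank of the difference vectors \(\bv_i - \bv_0\), and the points are affine independent exactly when that rank equals \(k\). The following is a minimal sketch, assuming points in \(\RR^n\) with rational coordinates and a hand-written Gaussian elimination over `Fraction`; the function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affine_dimension(points):
    """Dimension of the affine hull: rank of {v_1 - v_0, ..., v_k - v_0}."""
    v0 = points[0]
    diffs = [[a - b for a, b in zip(p, v0)] for p in points[1:]]
    return rank(diffs)

def is_affine_independent(points):
    """k + 1 points are affine independent iff their hull has dimension k."""
    return affine_dimension(points) == len(points) - 1

# Sample point sets in the plane (hypothetical data).
triangle = [(0, 0), (1, 0), (0, 1)]
collinear = [(0, 0), (1, 1), (2, 2)]
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

The three vertices of `triangle` are affine independent, the points of `collinear` span only a line, and the four points of `square` cannot be affine independent since the plane has dimension \(2\).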

Theorem 4.177 (Affine independent points in an affine subspace)

Let \(A\) be an affine subspace of a vector space \(\VV\) such that \(\dim A = k\). Then, it is possible to choose a set of up to \(k+1\) points in \(A\) which are affine independent. Any set of \(k+2\) points in \(A\) is not affine independent.

Proof. Let \(L\) be the subspace associated with \(A\) and let \(\bv_0 \in A\) be some fixed point of \(A\).

  1. We have \(\dim L = \dim A = k\).

  2. Choose a basis \(\BBB = \{\bx_1, \dots, \bx_k\}\) of \(L\).

  3. Let \(\bv_1 = \bx_1 + \bv_0, \dots, \bv_k = \bx_k + \bv_0\).

  4. Then, the \(k+1\) points \(\bv_0, \bv_1, \dots, \bv_k\) are affine independent since \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) are linearly independent.

  5. For fewer than \(k+1\) points, we can choose fewer than \(k\) vectors from the basis \(\BBB\) and construct accordingly.

We now show that any set of \(k+2\) points cannot be affine independent.

  1. Let \(\bv_0, \bv_1, \dots, \bv_k, \bv_{k+1}\) be an arbitrary set of \(k+2\) points in \(A\).

  2. Then, \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0, \bv_{k+1} - \bv_0 \in L\) is a set of \(k+1\) points in \(L\).

  3. Since \(\dim L = k\), hence any set of \(k+1\) points in \(L\) is linearly dependent.

  4. Thus, \(\bv_0, \bv_1, \dots, \bv_k, \bv_{k+1}\) cannot be affine independent.

Theorem 4.178 (Affine set as an affine hull)

Let \(A\) be an affine subspace of a vector space \(\VV\) such that \(\dim A = k\). Let \(\{ \bv_0, \dots, \bv_k \}\) be a set of \(k+1\) affine independent points of \(A\). Then,

\[ A = \affine \{ \bv_0, \dots, \bv_k \}. \]

Proof. Let \(L\) be the linear subspace associated with \(A\) and let

\[ H = \affine \{ \bv_0, \dots, \bv_k \}. \]

Since \(A\) being affine is closed under affine combinations, hence \(H \subseteq A\).

We now show that \(A \subseteq H\).

  1. Let \(\bv \in A\).

  2. Then, \(\bv - \bv_0 \in L\).

  3. By Theorem 4.175, \(\bv_1 - \bv_0, \dots, \bv_k - \bv_0\) form a basis for \(L\).

  4. Thus,

    \[ \bv - \bv_0 = t_1 (\bv_1 - \bv_0) + \dots + t_k (\bv_k - \bv_0). \]
  5. But then,

    \[ \bv = (1 - t_1 - \dots - t_k) \bv_0 + t_1 \bv_1 + \dots + t_k \bv_k \]

    which is an affine combination of \(\{ \bv_0, \dots, \bv_k \}\).

  6. Thus, \(\bv \in H\).

  7. Thus, \(A \subseteq H\).

Theorem 4.179 (Extending an affine independent set of points)

Let \(\VV\) be a finite dimensional vector space with \(n = \dim \VV\). Any set of \(m+1\) affine independent points in \(\VV\) (where \(m < n\)) can be extended to a set of \(n+1\) affine independent points.

Proof. Let \(S = \{\bv_0, \bv_1, \dots, \bv_m \}\) be a set of \(m+1\) affine independent points.

  1. Let \(A = \affine S\).

  2. Let \(L\) be the linear subspace associated with \(A\).

  3. Let \(\bx_1 = \bv_1 - \bv_0, \dots, \bx_m = \bv_m - \bv_0\).

  4. By Theorem 4.175, the set \(\{\bx_1, \dots, \bx_m \}\) forms a basis for \(L\).

  5. Extend \(\{\bx_1, \dots, \bx_m \}\) to \(\{\bx_1, \dots, \bx_n \}\) to form a basis for \(\VV\).

  6. Compute the points \(\bv_i = \bx_i + \bv_0\) for \(i=m+1, \dots, n\).

  7. Then, the set of points \(\{ \bv_0, \dots, \bv_n \}\) is an affine independent set since the \(\{\bx_1, \dots, \bx_n \}\) are linearly independent.

4.14.8. Barycentric Coordinate System

Theorem 4.180 (Unique representation from affine independent points)

Let \(\VV\) be a vector space and \(\bv_0, \bv_1, \dots, \bv_k\) be a set of \(k+1\) affine independent points in \(\VV\).
Let \(S = \{ \bv_0, \bv_1, \dots, \bv_k \}\). Let \(A = \affine S\).

Then, every point in \(A\) can be represented uniquely as

\[ \bv = t_0 \bv_0 + \dots + t_k \bv_k \]

such that \(t_0 + \dots + t_k = 1\).

Proof. By definition of affine hull, any point in the hull \(A\) is an affine combination of the points in \(S\). We shall first find a suitable representation as an affine combination of \(S\). Then, we shall prove the uniqueness of such a representation by showing that if the representation is not unique then the points in \(S\) cannot be affine independent.

Let \(L\) be the linear subspace associated with \(A\). Let \(\bx_1 = \bv_1 - \bv_0, \dots, \bx_k = \bv_k - \bv_0\). Then, by Theorem 4.175, the set \(\BBB = \{ \bx_1, \dots, \bx_k\}\) forms a basis for \(L\).

Let \(\bv \in A\). Then, \(\bx = \bv - \bv_0 \in L\).

Then, there is a unique representation of \(\bx\) in the basis \(\BBB\):

\[ \bx = s_1 \bx_1 + \dots + s_k \bx_k. \]

Then,

\[\begin{split} \bv &= \bx + \bv_0\\ &= s_1 \bx_1 + \dots + s_k \bx_k + \bv_0\\ &= s_1 (\bv_1 - \bv_0) + \dots + s_k (\bv_k - \bv_0) + \bv_0\\ &= (1 - s_1 - \dots - s_k) \bv_0 + s_1 \bv_1 + \dots + s_k \bv_k. \end{split}\]

Letting \(t_0 = 1 - s_1 - \dots - s_k\) and \(t_i = s_i\) for \(i=1,\dots,k\) we arrive at a representation of \(\bv\) in terms of points in \(S\) such that \(t_0 + \dots + t_k = 1\) and

\[ \bv = t_0 \bv_0 + t_1 \bv_1 + \dots + t_k \bv_k. \]

We now claim that this representation is unique. Suppose, there was another representation

\[ \bv = r_0 \bv_0 + r_1 \bv_1 + \dots + r_k \bv_k \]

such that \(r_0 + \dots + r_k = 1\).

Then, we would have:

\[\begin{split} & r_0 \bv_0 + r_1 \bv_1 + \dots + r_k \bv_k = t_0 \bv_0 + t_1 \bv_1 + \dots + t_k \bv_k\\ & \iff (1 - r_1 - \dots - r_k) \bv_0 + r_1 \bv_1 + \dots + r_k \bv_k = (1 - t_1 - \dots - t_k) \bv_0 + t_1 \bv_1 + \dots + t_k \bv_k\\ & \iff r_1 (\bv_1 - \bv_0) + \dots + r_k (\bv_k - \bv_0) + \bv_0 = t_1 (\bv_1 - \bv_0) + \dots + t_k (\bv_k - \bv_0) + \bv_0\\ & \iff r_1 \bx_1 + \dots + r_k \bx_k = t_1 \bx_1 + \dots + t_k \bx_k\\ & \iff (r_1 - t_1) \bx_1 + \dots + (r_k - t_k) \bx_k = \bzero. \end{split}\]

But, the set \(\{ \bx_1, \dots, \bx_k \}\) is linearly independent since \(\bv_0, \bv_1, \dots, \bv_k\) are affine independent.

Hence, \(r_1 = t_1, \dots, r_k = t_k\) must be true.

Thus, \(r_0 = t_0\) must be true since \(r_0 = 1 - r_1 - \dots - r_k\) and \(t_0 = 1 - t_1 - \dots - t_k\).

Thus, \(\bv\) has a unique representation as an affine combination of the points in \(S\).

This unique representation can be used to define a coordinate system in an affine set.

Definition 4.167 (Barycentric coordinate system)

Let \(\VV\) be a vector space and \(\bv_0, \bv_1, \dots, \bv_k\) be a set of \(k+1\) affine independent points in \(\VV\).
Let \(S = \{ \bv_0, \bv_1, \dots, \bv_k \}\). Let \(A = \affine S\).

Then, every point in \(A\) can be represented uniquely as

\[ \bv = t_0 \bv_0 + \dots + t_k \bv_k \]

such that \(t_0 + \dots + t_k = 1\). This representation is known as the barycentric coordinate system.

If \(A\) is an arbitrary finite dimensional affine subspace of \(\VV\) with \(\dim A = k\), then we can select \(k+1\) affine independent points \(\bv_0, \bv_1, \dots, \bv_k \in A\) (thanks to Theorem 4.177). Any such set of \(k+1\) affine independent points of \(A\) affords \(A\) with a barycentric coordinate system.
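For three affine independent points in the plane, the barycentric coordinates can be computed by following the proof of Theorem 4.180: express \(\bv - \bv_0\) in the basis \(\bv_1 - \bv_0, \bv_2 - \bv_0\) and set \(t_0 = 1 - s_1 - s_2\). The sketch below assumes points of \(\RR^2\) with rational coordinates and solves the \(2 \times 2\) system by Cramer's rule; the function name `barycentric_2d` and the sample triangle are illustrative.

```python
from fractions import Fraction

def barycentric_2d(v0, v1, v2, v):
    """Barycentric coordinates (t0, t1, t2) of v with respect to v0, v1, v2."""
    # Difference vectors x_i = v_i - v_0 and the target x = v - v_0.
    x1 = (v1[0] - v0[0], v1[1] - v0[1])
    x2 = (v2[0] - v0[0], v2[1] - v0[1])
    x = (v[0] - v0[0], v[1] - v0[1])
    det = x1[0] * x2[1] - x1[1] * x2[0]
    if det == 0:
        raise ValueError("points are not affine independent")
    # Solve x = s1*x1 + s2*x2 by Cramer's rule.
    s1 = Fraction(x[0] * x2[1] - x[1] * x2[0]) / det
    s2 = Fraction(x1[0] * x[1] - x1[1] * x[0]) / det
    return (1 - s1 - s2, s1, s2)

# Sample triangle (hypothetical data).
v0, v1, v2 = (0, 0), (4, 0), (0, 2)

inside = barycentric_2d(v0, v1, v2, (1, 1))
outside = barycentric_2d(v0, v1, v2, (4, 2))
```

The coordinates always sum to one. For a point such as \((4, 2)\) outside the triangle, some coordinates are negative, which is allowed in an affine combination.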

4.14.9. Translations

Definition 4.168 (Translation operator)

Let \(\VV\) be a vector space. An operator \(T_{\ba} : \VV \to \VV\) is called a translation operator if

\[ T_{\ba}(\bx) = \bx + \ba \Forall \bx \in \VV \]

where \(\ba \in \VV\) is a fixed (translation) vector.

It can be easily seen that \(T_{\ba}(C) = \ba + C = C + \ba\) for any \(C \subseteq \VV\).

Definition 4.169 (Translate)

Let \(C \subseteq \VV\). The translate of \(C\) by some \(\ba \in \VV\) is defined to be the set \(C + \ba\).

Observation 4.11 (Translating the vector space)

\[ \VV + \ba = \VV \Forall \ba \in \VV. \]

Translating the whole vector space doesn’t change it.

\[ \EmptySet + \ba = \EmptySet. \]

This follows from the definition of the set vector addition.

\[ \{ \bzero \} + \ba = \{ \ba \}. \]

The translate of the trivial subspace is a singleton set.

Theorem 4.181 (Affine translate)

A translate of an affine set is affine.

Proof. Let \(C\) be affine and \(\ba \in \VV\).

  1. Let \(\bx, \by \in C + \ba\).

  2. Then, \(\bx = \bu + \ba\) and \(\by = \bv + \ba\) for some \(\bu, \bv \in C\).

  3. Then for any \(t \in \FF\),

    \[\begin{split} t \bx + (1-t) \by &= t (\bu + \ba) + (1-t) (\bv + \ba)\\ &= t \bu + (1-t)\bv + t \ba + (1-t)\ba\\ &= t \bu + (1-t)\bv + \ba. \end{split}\]
  4. But \(\bw = t \bu + (1-t)\bv \in C\) since \(C\) is affine.

  5. Hence, \(t \bx + (1-t) \by = \bw + \ba \in C + \ba\).

  6. Thus, \(C + \ba\) is affine.

Definition 4.170 (Parallel affine sets)

Two affine sets \(C\) and \(D\) are called parallel to each other if

\[ D = C + \ba \]

for some \(\ba \in \VV\). We denote this by \(C \parallel D\).

Clearly, every nonempty affine set is parallel to its associated linear subspace.

This definition of parallelism is more restrictive than the everyday notion as it allows comparing only those affine sets which have the same dimension. Thus, we cannot compare a line with a plane.

Every point is parallel to every other point.

Theorem 4.182 (Parallelism equivalence relation)

Consider the class of all affine subsets of a vector space \(\VV\). The relation \(C \parallel D\) is an equivalence relation.

Proof. [Reflexivity]

  1. \(C = C + \bzero\).

  2. Hence \(C \parallel C\).

[Symmetry]

  1. Let \(C \parallel D\).

  2. Then, there exists \(\ba \in \VV\) such that \(D = C + \ba\).

  3. But then, \(C = D + (-\ba)\).

  4. Thus, \(D \parallel C\).

[Transitivity]

  1. Let \(C \parallel D\) and \(D \parallel E\).

  2. Then, \(D = C + \ba\) and \(E = D + \bb\) for some \(\ba, \bb \in \VV\).

  3. But then, \(E = C + (\ba + \bb)\).

  4. Thus, \(C \parallel E\).

Theorem 4.183 (Existence and uniqueness of a parallel linear subspace)

Every affine subspace (nonempty affine set) \(A\) is parallel to a unique subspace. The subspace is given by:

\[ W = A - A. \]

This result is a restatement of Observation 4.10.

Proof. From Theorem 4.168, there is a unique linear subspace \(L\) associated with \(A\) given by \(L = A - \ba\) for some \(\ba \in A\).

Since \(A = L + \ba\) hence, \(A\) and \(L\) are parallel to each other.

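Two linear subspaces are parallel to each other only if they are identical: if \(L_2 = L_1 + \bb\) for linear subspaces \(L_1\) and \(L_2\), then \(\bzero \in L_2\) forces \(\bb \in L_1\) and hence \(L_2 = L_1\). Thus, \(L\) is the unique linear subspace parallel to \(A\).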

Now, notice that:

\[ W = A - A = \bigcup_{\ba \in A} (A - \ba). \]

But \(L = A - \ba\) for any \(\ba \in A\) as \(L\) is independent of the choice of \(\ba \in A\).

Thus,

\[ W = \bigcup_{\ba \in A} (A - \ba) = \bigcup_{\ba \in A} L = L. \]

Thus, the unique linear subspace parallel to \(A\) is given by \(W = A - A\).

4.14.10. Affinity Preserving Operations

We discuss some operations which preserve the affine character of their inputs.

4.14.10.1. Intersection

Theorem 4.184 (Intersection of affine sets)

If \(S_1\) and \(S_2\) are affine sets then \(S_1 \cap S_2\) is affine.

Proof. Let \(\bx_1, \bx_2 \in S_1 \cap S_2\). We have to show that

\[ t \bx_1 + (1 - t) \bx_2 \in S_1 \cap S_2, \Forall t \in \FF. \]

Since \(S_1\) is affine and \(\bx_1, \bx_2 \in S_1\), hence

\[ t \bx_1 + (1 - t) \bx_2 \in S_1, \Forall t \in \FF. \]

Similarly

\[ t \bx_1 + (1 - t) \bx_2 \in S_2, \Forall t \in \FF. \]

Thus

\[ t \bx_1 + (1 - t) \bx_2 \in S_1 \cap S_2, \Forall t \in \FF. \]

Thus, \(S_1 \cap S_2\) is affine.

We can generalize it further.

Theorem 4.185 (Intersection of arbitrary collection of affine sets)

Let \(\{ A_i\}_{i \in I}\) be a family of sets such that \(A_i\) is affine for all \(i \in I\). Then \(\cap_{i \in I} A_i\) is affine.

Proof. Let \(\bx_1, \bx_2\) be any two arbitrary elements in \(\cap_{i \in I} A_i\).

\[\begin{split} &\bx_1, \bx_2 \in \cap_{i \in I} A_i\\ \implies & \bx_1, \bx_2 \in A_i \Forall i \in I\\ \implies &t \bx_1 + (1 - t) \bx_2 \in A_i \Forall t \in \FF \Forall i \in I \text{ since $A_i$ is affine }\\ \implies &t \bx_1 + (1 - t) \bx_2 \in \cap_{i \in I} A_i. \end{split}\]

Hence \(\cap_{i \in I} A_i\) is affine.

4.14.11. Hyperplanes#

Recall from Definition 4.87 that a set of the form:

\[ H_{\bf, a} \triangleq \{ \bx \in \VV \ST \bf(\bx) = a \} \]

where \(\bf\) is a nonzero linear functional on \(\VV\) and \(a \in \FF\) is called a hyperplane.

Theorem 4.186

Every hyperplane is affine.

Proof. We proceed as follows:

  1. Let \(\bx, \by \in H_{\bf, a}\).

  2. Then, \(\bf(\bx) = a\) and \(\bf(\by) = a\).

  3. Consider any \(t \in \FF\) and let \(\bz = t \bx + (1-t) \by\).

  4. Then, due to linearity of \(\bf\),

    \[\begin{split} \bf(\bz) &= \bf(t \bx + (1-t) \by)\\ &= t \bf(\bx) + (1-t) \bf(\by)\\ &= t a + (1-t) a = a. \end{split}\]
  5. Thus, \(\bz \in H_{\bf, a}\).

  6. Thus, \(H_{\bf, a}\) is an affine set.
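
The argument above can be checked on a concrete instance. The following is a minimal sketch, assuming rational vectors in \(\RR^3\) (exact arithmetic via Python's `fractions` module) and a hypothetical functional \(\bf(\bx) = x_1 + 2 x_2 - x_3\) with level \(a = 4\); these specific values are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def f(x, c):
    """Linear functional f(x) = sum_i c_i x_i (the coefficients c are an assumed example)."""
    return sum(ci * xi for ci, xi in zip(c, x))

def affine_combination(t, x, y):
    """Return t*x + (1 - t)*y componentwise."""
    return [t * xi + (1 - t) * yi for xi, yi in zip(x, y)]

c = [Fraction(1), Fraction(2), Fraction(-1)]   # nonzero functional (assumed)
a = Fraction(4)                                # level of the hyperplane (assumed)

x = [Fraction(4), Fraction(0), Fraction(0)]    # f(x) = 4
y = [Fraction(0), Fraction(3), Fraction(2)]    # f(y) = 6 - 2 = 4
assert f(x, c) == a and f(y, c) == a

# Every affine combination of x and y stays on the hyperplane,
# including parameters outside the interval [0, 1].
for t in [Fraction(-3), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(7, 3)]:
    z = affine_combination(t, x, y)
    assert f(z, c) == a
```

Note that the parameter \(t\) is not restricted to \([0, 1]\); this is what distinguishes affine combinations from convex combinations.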

Theorem 4.187 (Linear subspace parallel to a hyperplane)

Let \(H\) be a hyperplane given by

\[ H = \{ \bx \in \VV \ST \bf(\bx) = a \} \]

where \(\bf\) is a nonzero linear functional on \(\VV\) and \(a \in \FF\).

Then, the linear subspace parallel to \(H\) is given by the kernel of the linear functional \(\bf\):

\[ L = \bf^{-1}(0) = \{ \bx \in \VV \ST \bf(\bx) = 0 \}. \]

Proof. Let \(V\) be the linear subspace parallel to \(H\).

  1. Then, any \(\bv \in V\) can be written as \(\bv = \bx - \by\) for some \(\bx, \by \in H\).

  2. But then,

    \[ \bf(\bv) = \bf (\bx - \by) = \bf (\bx) - \bf(\by) = a - a = 0. \]
  3. Thus, \(\bv \in L\) and hence \(V \subseteq L\).

For the converse, we proceed as follows.

  1. Let \(\bv \in L\) and \(\bx \in H\).

  2. Let \(\by = \bx - \bv\).

  3. Then, \(\bf(\by) = \bf (\bx) - \bf (\bv) = a - 0 = a\).

  4. Thus, \(\by \in H\).

  5. Thus, \(\bv = \bx - \by\) where \(\bx, \by \in H\).

  6. Thus, \(\bv \in H - H = V\).

  7. Thus, \(L \subseteq V\).

Combining, \(L = V\).

Theorem 4.188 (Dimension of a hyperplane)

Let \(H\) be a hyperplane given by

\[ H = \{ \bx \in \VV \ST \bf(\bx) = a \} \]

where \(\bf\) is a nonzero linear functional on \(\VV\) and \(a \in \FF\).

If \(\VV\) is finite dimensional, then

\[ \dim H = \dim \VV - 1. \]

Proof. From Theorem 4.187, the linear subspace parallel to \(H\) is given by

\[ L = \bf^{-1}(0) = \{ \bx \in \VV \ST \bf(\bx) = 0 \}. \]

From Theorem 4.99, the dimension of the kernel of a linear functional in a finite dimensional vector space is given by:

\[ \dim L = \dim \VV - 1. \]

From Definition 4.163,

\[ \dim H = \dim L = \dim \VV - 1. \]
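
As a small worked illustration (the specific functional is an assumption chosen for this example), take \(\VV = \RR^3\), \(\bf(\bx) = x_1 + x_2 + x_3\) and \(a = 1\). Then

\[ H = \{ \bx \in \RR^3 \ST x_1 + x_2 + x_3 = 1 \} \text{ and } L = \bf^{-1}(0) = \{ s (1, -1, 0) + t (0, 1, -1) \ST s, t \in \RR \}. \]

The vectors \((1, -1, 0)\) and \((0, 1, -1)\) are linearly independent and both lie in the kernel of \(\bf\), so \(\dim L = 2 = \dim \RR^3 - 1\). Since \((1, 0, 0) \in H\), we can write \(H = L + (1, 0, 0)\), and hence \(\dim H = 2\).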

Theorem 4.189 (Hyperplanes in inner product spaces)

If \(\VV\) is an inner product space over \(\FF\), then a set of the form

\[ H = \{\bx \ST \langle \bx, \ba \rangle = b \} \]

where \(\ba \in \VV\) is a nonzero vector and \(b \in \FF\), is a hyperplane.

Moreover, every hyperplane of \(\VV\) can be represented in this form, with \(\ba\) and \(b\) unique up to a common non-zero multiple.

Proof. By Theorem 4.102, the mapping \(T_{\ba} : \VV \to \FF\) defined by:

\[ T_{\ba} (\bx) \triangleq \langle \bx , \ba \rangle \Forall \bx \in \VV \]

is a linear functional. Thus, \(H\) is a hyperplane.

By Theorem 4.104, every linear functional can be identified as an inner product with a vector \(\ba \in \VV\). Thus, every hyperplane can be written as

\[ H = \{\bx \ST \langle \bx, \ba \rangle = b \}. \]

This representation is not unique since the set

\[ \{\bx \ST \langle \bx, \overline{t} \ba \rangle = t b \} \]

is identical to \(H\) for any \(t \in \FF\) such that \(t \neq 0\).

Theorem 4.190 (Affine = Intersection of hyperplanes)

Let \(\VV\) be a finite dimensional vector space. Then, every proper affine subset of \(\VV\) is a finite intersection of hyperplanes.

Proof. If \(C = \EmptySet\), we can choose any two non-intersecting parallel hyperplanes and \(C\) is their intersection.

Let \(C\) be a proper affine subspace of \(\VV\) such that \(1 \leq \dim C < \dim \VV\).

  1. Let \(L\) be the linear subspace parallel to \(C\).

  2. Then \(C = L + \ba\) for some fixed \(\ba \in C\).

  3. Let \(n = \dim \VV\) and \(m = \dim L\).

  4. Since \(L\) is a proper subspace of \(\VV\) hence \(m < n\).

  5. Let \(\{\bx_1, \dots, \bx_m \}\) be a basis for \(L\).

  6. Then, every \(\bv \in C\) can be written as:

    \[ \bv = \sum_{i=1}^m t_i \bx_i + \ba. \]
  7. We can extend this basis to construct a basis \(\{\bx_1, \dots, \bx_n \}\) for \(\VV\).

  8. We can construct a dual basis for the dual space \(\VV^*\). For each \(i=1,\dots,n\), define a linear functional \(\bf_i : \VV \to \FF\) by setting:

    \[\begin{split} \bf_i(\bx_j) = \begin{cases} 1 && \text{ if } && i = j\\ 0 && \text{ if } && i \neq j \end{cases}. \end{split}\]
  9. Let \(a_i = \bf_i (\ba)\).

  10. Consider a family of hyperplanes defined as:

    \[ H_i = \{ \bx \in \VV \ST \bf_i(\bx) = a_i \} \]

    where \(i=m+1, \dots, n\).

  11. Consider their intersection

    \[ H = \bigcap_{i=m+1}^n H_i = \{ \bx \in \VV \ST \bf_i(\bx) = a_i, i=m+1,\dots, n \}. \]
  12. We claim that \(C = H\).

We shall first show that \(C \subseteq H\).

  1. Let \(\bv \in C\).

  2. Then, \(\bv = \sum_{j=1}^m t_j \bx_j + \ba\).

  3. Then, \(\bf_i (\bv) = \sum_{j=1}^m t_j \bf_i(\bx_j) + \bf_i(\ba) = a_i\) for \(i=m+1, \dots, n\).

  4. Thus, \(\bv \in H_i\) for every \(i=m+1, \dots, n\).

  5. Thus, \(\bv \in H\).

  6. Thus, \(C \subseteq H\).

We now show that \(H \subseteq C\). Note that this is the same as showing \(H - \ba \subseteq L = C - \ba\).

  1. Let \(\bv \in H - \ba\).

  2. Hence, \(\bv = \bx - \ba\) such that \(\bx \in H\).

  3. We can write \(\bv\) in terms of the basis \(\{\bx_1, \dots, \bx_n\}\) as

    \[ \bv = \sum_{j=1}^n t_j \bx_j. \]
  4. Then \(\bf_i(\bv) = t_i\) (by definition of \(\bf_i\)).

  5. But, for any \(i \in \{m+1, \dots, n\}\)

    \[ \bf_i(\bv) = \bf_i(\bx - \ba) = \bf_i (\bx) - \bf_i(\ba) = a_i - a_i = 0 \]

    since \(\bx \in H \subseteq H_i\).

  6. Thus, \(t_i = 0\) for every \(i=m+1, \dots, n\).

  7. Thus,

    \[ \bv = \sum_{j=1}^m t_j \bx_j. \]
  8. Thus, \(\bv \in L\) since \(\{\bx_1, \dots, \bx_m\}\) is a basis for \(L\).

  9. Thus, \(H - \ba \subseteq L\).

  10. Thus, \(H \subseteq L + \ba = C\).

Combining these observations, we have \(H = C\).

We are now left with the case of singleton sets \(C = \{ \ba \}\) where \(\dim C = 0\) since the associated linear subspace is \(\{ \bzero \}\).

  1. Choose any basis \(\BBB = \{\bx_1, \dots, \bx_n\}\) for \(\VV\).

  2. Construct a dual basis \(\FFF = \{\bf_1, \dots, \bf_n \}\) for \(\VV^*\) as before.

  3. Let \(a_i = \bf_i(\ba)\) for \(i=1,\dots, n\).

  4. Consider a family of hyperplanes defined as:

    \[ H_i = \{ \bx \in \VV \ST \bf_i(\bx) = a_i \} \]

    where \(i=1, \dots, n\).

  5. Consider their intersection

    \[ H = \bigcap_{i=1}^n H_i = \{ \bx \in \VV \ST \bf_i(\bx) = a_i, i=1,\dots, n \}. \]
  6. Now, it is straightforward to show that \(H = \{\ba \} = C\).

Corollary 4.32 (Affine sets in inner product space)

Let \(\VV\) be a finite dimensional inner product space over field \(\FF\). Let \(A\) be a proper affine subset of \(\VV\).

Then, there exist \(r\) (where \(r < \dim \VV\)) nonzero vectors \(\ba_i \in \VV\) and scalars \(b_i \in \FF\) such that \(A\) is the intersection of the hyperplanes given by

\[ H_i = \{\bx \in \VV \ST \langle \bx, \ba_i \rangle = b_i \}. \]

Specifically,

\[ A = \bigcap_{i=1}^r H_i = \{\bx \in \VV \ST \langle \bx, \ba_i \rangle = b_i, i=1,\dots,r \}. \]

Proof. It follows from Theorem 4.190 that \(A\) is a finite intersection of hyperplanes with \(r < n\) where \(n = \dim \VV\).

Since \(\VV\) is an inner product space, hence, due to Theorem 4.189, each hyperplane can be represented as

\[ H_i = \{\bx \in \VV \ST \langle \bx, \ba_i \rangle = b_i \} \]

where \(\ba_i \in \VV\) and \(b_i \in \FF\).

Procedure to select the hyperplane parameters.

  1. Pick a vector \(\ba \in A\).

  2. Identify the linear subspace \(L = A - \ba\).

  3. Pick an orthonormal basis for \(L\): \(\{\bv_1, \dots, \bv_m \}\).

  4. Extend the orthonormal basis to \(\VV\).

  5. Pick the basis vectors for \(L^{\perp}\): \(\{\ba_1, \dots, \ba_r \}\) with \(m + r = n\).

  6. Compute \(b_i = \langle \ba, \ba_i \rangle\).
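
The procedure above can be sketched in code. This is a minimal sketch under stated assumptions: the space is \(\RR^3\) with the standard real inner product, all entries are rational so that arithmetic is exact, and the helper `null_space` is a hypothetical function written for this illustration. It computes a basis for \(L^{\perp}\) by Gaussian elimination rather than by extending an orthonormal basis; the resulting normals are not orthonormal, but any basis of \(L^{\perp}\) yields a valid family of hyperplanes.

```python
from fractions import Fraction

def dot(x, y):
    """Standard real inner product."""
    return sum(xi * yi for xi, yi in zip(x, y))

def null_space(rows, n):
    """Basis of {x : <r, x> = 0 for every r in rows}, by exact Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    pivots = []
    row = 0
    for col in range(n):
        if row == len(m):
            break
        # Find a row at or below `row` with a nonzero entry in this column.
        p = next((i for i in range(row, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[row], m[p] = m[p], m[row]
        pv = m[row][col]
        m[row] = [v / pv for v in m[row]]
        for i in range(len(m)):
            if i != row and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[row])]
        pivots.append(col)
        row += 1
    # One basis vector per free (non-pivot) column.
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        x = [Fraction(0)] * n
        x[fc] = Fraction(1)
        for r, pc in enumerate(pivots):
            x[pc] = -m[r][fc]
        basis.append(x)
    return basis

# Assumed example: the line A through a = (1, 2, 3) with direction (1, 1, 0).
a = [Fraction(1), Fraction(2), Fraction(3)]
L_basis = [[Fraction(1), Fraction(1), Fraction(0)]]

normals = null_space(L_basis, 3)        # basis {a_1, ..., a_r} of the orthogonal complement of L
b = [dot(a, ai) for ai in normals]      # b_i = <a, a_i>

# Every point a + s v of the line satisfies all r hyperplane equations.
for s in [Fraction(-2), Fraction(0), Fraction(5, 3)]:
    p = [ai + s * vi for ai, vi in zip(a, L_basis[0])]
    assert all(dot(p, ai) == bi for ai, bi in zip(normals, b))
```

Here \(m = 1\) and \(r = 2\), in agreement with \(m + r = n = 3\).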

4.14.12. Linear Equations#

Example 4.35 (Solution set of linear equations)

We show that the solution set of linear equations forms an affine set.

Let \(\bA \in \FF^{m \times n}\) and \(\bb \in \FF^m\). Let \(C = \{ \bx \in \FF^n \ST \bA \bx = \bb\}\) be the set of all vectors which satisfy the system of linear equations \(\bA \bx = \bb\). We claim that \(C\) is an affine set.

Let \(\bx_1\) and \(\bx_2\) belong to \(C\). Then we have

\[ \bA \bx_1 = \bb \text{ and } \bA \bx_2 = \bb \]

Thus

\[\begin{split} &\theta \bA \bx_1 + ( 1 - \theta ) \bA \bx_2 = \theta \bb + (1 - \theta ) \bb\\ &\implies \bA (\theta \bx_1 + (1 - \theta) \bx_2) = \bb\\ &\implies (\theta \bx_1 + (1 - \theta) \bx_2) \in C \end{split}\]

Thus, \(C\) is an affine set.

The subspace associated with \(C\) is nothing but the null space of \(\bA\) denoted as \(\NullSpace(\bA)\).

Every affine set of \(\FF^n\) can be expressed as the solution set of a system of linear equations. If the system of equations is infeasible, then its solution set is \(\EmptySet\). Otherwise, its solution set is an affine subspace. If the system of equations has a unique solution, then the solution set is a singleton set which is an affine subspace of dimension 0.
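
The following is a minimal sketch of this example on a concrete system, assuming a hypothetical \(2 \times 3\) rational matrix and right-hand side chosen only for illustration. It checks that affine combinations of two solutions are again solutions, and that the difference of two solutions lies in the null space of \(\bA\).

```python
from fractions import Fraction

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

# Assumed system: x1 + x2 + x3 = 3 and x1 - x2 = 0.
A = [[Fraction(1), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(-1), Fraction(0)]]
b = [Fraction(3), Fraction(0)]

x1 = [Fraction(1), Fraction(1), Fraction(1)]   # a solution
x2 = [Fraction(0), Fraction(0), Fraction(3)]   # another solution
assert matvec(A, x1) == b and matvec(A, x2) == b

# theta * x1 + (1 - theta) * x2 solves the system for every scalar theta.
for theta in [Fraction(-1), Fraction(1, 4), Fraction(2)]:
    x = [theta * u + (1 - theta) * v for u, v in zip(x1, x2)]
    assert matvec(A, x) == b

# The difference of two solutions lies in the null space of A.
d = [u - v for u, v in zip(x1, x2)]
assert matvec(A, d) == [0, 0]
```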

Theorem 4.191 (Affine set = system of linear equations in \(\FF^n\))

Let \(\bb \in \FF^m\). Let \(\bA\) be an \(m \times n\) matrix in \(\FF^{m \times n}\). Consider the solution set of the system of linear equations \(\bA \bx = \bb\):

\[ C = \{\bx \in \FF^n \ST \bA \bx = \bb \}. \]

Then, \(C\) is an affine set.

Moreover, every affine set in \(\FF^n\) can be represented as a system of linear equations.

Proof. If \(C = \EmptySet\) (i.e., the system of equations is infeasible), then \(C\) is affine (since empty sets are affine by definition).

If the system of equations has a unique solution, then \(C = \{ \bv \}\) where \(\bv\) is the unique solution of the system of equations, then \(C\) is affine since singleton sets are affine.

We now consider the case that the system of equations has more than one solution.

Let \(\bx_1, \bx_2 \in C\) be distinct solutions of the system of linear equations and let \(t \in \FF\). Then,

\[ \bA \bx_1 = \bb \text{ and } \bA \bx_2 = \bb \]

Consider \(\bx = t \bx_1 + (1-t) \bx_2\). Then,

\[\begin{split} \bA \bx &= \bA (t \bx_1 + (1-t) \bx_2) \\ &= t \bA \bx_1 + (1-t) \bA \bx_2 \\ &= t \bb + (1-t) \bb = \bb. \end{split}\]

This means that \(\bx \in C\). Thus, \(C\) contains all its affine combinations. Hence, \(C\) is affine.

We next show that every affine set of \(\FF^n\) can be represented as a system of linear equations. Note that \(\FF^n\) is an inner product space with the standard inner product given by \(\langle \bx, \by \rangle = \by^H \bx\).

Let \(C\) be an arbitrary affine set in \(\FF^n\).

  1. If \(C = \EmptySet\), we can pick any infeasible system of linear equations as a representation of \(C\).

  2. If \(C = \{ \bv \}\) is a singleton, we can pick the system \(\bI \bx = \bv\) where \(\bI\) is the identity matrix in \(\FF^{n \times n}\).

  3. If \(C = \FF^n\), we can choose \(\bA\) to be any \(m \times n\) zero matrix and \(\bb = \bzero \in \FF^m\). Then, the solution set of \(\ZERO \bx = \bzero\) is all of \(\FF^n\).

  4. We shall now consider the case of affine \(C\) with more than one element and \(C \subset \FF^n\) (proper subset).

  5. Let \(L\) be the subspace parallel to \(C\) (Theorem 4.183).

  6. Let \(L^{\perp}\) be the orthogonal complement of \(L\).

  7. Let \(\BBB = \{\bv_1, \dots, \bv_m \}\) be a basis for \(L^{\perp}\) (where \(m < n\)).

  8. Since \(\FF^n\) is finite dimensional, hence \(L = \left (L^{\perp} \right )^{\perp}\) (Theorem 4.88).

  9. Thus, due to Theorem 4.84,

    \[ L = \{ \bx \ST \bx \perp \bv_1, \dots, \bx \perp \bv_m \}. \]
  10. Thus,

    \[ L = \{\bx \ST \langle \bx, \bv_i \rangle = 0, i=1,\dots, m \} = \{ \bx \ST \bA \bx = \bzero \} \]

    where \(\bA\) is the \(m \times n\) matrix whose rows are \(\bv_1^H, \dots, \bv_m^H\) (the conjugate transposes of the basis vectors).

  11. Since \(C\) is parallel to \(L\), there exists an \(\ba \in \FF^n\) such that

    \[ C = L + \ba = \{ \bx \ST \bA (\bx - \ba) = \bzero \} = \{ \bx \ST \bA \bx = \bb \} \]

    where \(\bb = \bA \ba\).
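
As a small worked instance of this construction (the specific line is an assumption chosen for illustration), let \(C \subset \RR^2\) be the line through \(\ba = (1, 0)\) with direction \((1, 1)\). Then \(L = \{ s (1, 1) \ST s \in \RR \}\) and \(L^{\perp}\) is spanned by \(\bv_1 = (1, -1)\), so that

\[ \bA = \begin{bmatrix} 1 & -1 \end{bmatrix}, \quad \bb = \bA \ba = 1, \quad C = \{ \bx \in \RR^2 \ST x_1 - x_2 = 1 \}. \]

Indeed, every point \((1 + s, s)\) of the line satisfies \(x_1 - x_2 = 1\).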

4.14.13. Affine Transformations#

Definition 4.171 (Affine transformation)

Let \(\XX\) and \(\YY\) be vector spaces (on some field \(\FF\)). A (total) function \(T : \XX \to \YY\) is called an affine transformation if for every \(\bx,\by \in \XX\) and for every \(t \in \FF\)

\[ T (t \bx + (1 - t) \by) = t T(\bx) + (1 - t) T(\by). \]

An affine transformation is also known as an affine function or an affine operator.

An affine transformation preserves affine combinations. An affine combination in input leads to an identical affine combination in output.

4.14.13.1. Relation with Linear Transformations#

We next show that a linear transformation followed by a translation is affine.

Theorem 4.192 (Linear + Translation \(\implies\) Affine)

Let \(\XX\) and \(\YY\) be vector spaces (on some field \(\FF\)). Let \(L : \XX \to \YY\) be a linear transformation and let \(\ba \in \YY\). Define \(T : \XX \to \YY\) as

\[ T (\bx) = L (\bx) + \ba. \]

Then, \(T\) is an affine transformation.

Proof. Let \(\bx, \by \in \XX\) and \(t \in \FF\). Then

\[\begin{split} T (t \bx + (1 - t) \by) &= L (t \bx + (1 - t) \by) + \ba \\ &= t L(\bx) + (1 -t) L (\by) + \ba \\ &= t L(\bx) + t \ba + (1 -t) L (\by) + (1-t)\ba \\ &= t (L (\bx) + \ba) + (1 -t) (L (\by) + \ba)\\ &= tT (\bx) + (1- t) T(\by). \end{split}\]

Thus, \(T\) is affine.

We now prove a stronger result that every affine function is a linear transformation followed by a translation.

Theorem 4.193 (Affine = Linear + Translation)

Let \(\XX\) and \(\YY\) be vector spaces (on some field \(\FF\)). Let \(T : \XX \to \YY\) be some mapping. Then, \(T\) is affine if and only if the mapping \(\bx \mapsto T(\bx) - T(\bzero)\) is linear.

In other words, an affine transformation can be written as a linear transformation followed by a translation and vice-versa.

Proof. Define:

\[ L (\bx) = T (\bx) - T(\bzero). \]

Notice that \(L(\bzero) = T(\bzero) - T(\bzero) = \bzero\). Thus, \(L\) maps zero vector from \(\XX\) to the zero vector of \(\YY\).

We need to show that

\[ L \text{ linear } \iff T \text{ affine}. \]

We shall show it in two steps.

  1. Show that if \(T\) is affine, then \(L\) must be linear.

  2. Show that if \(L\) is linear, then \(T\) must be affine.

Assume \(T\) to be affine. We shall show that \(L\) is linear.

Let \(\bx, \by \in \XX\) and \(t \in \FF\).

[Scalar multiplication]

\[\begin{split} L(t\bx) &= T (t\bx) - T(\bzero)\\ &= T(t\bx + (1-t) \bzero) - T(\bzero)\\ &= t T(\bx) + (1-t)T(\bzero) - T(\bzero)\\ &= t (T(\bx) - T(\bzero)) = t L(\bx). \end{split}\]

[Vector addition]

\[\begin{split} L (\bx + \by) &= T (\bx + \by) - T(\bzero)\\ &= T(\frac{1}{2} 2 \bx + \frac{1}{2} 2 \by) - T(\bzero)\\ &= \frac{1}{2} T (2 \bx) + \frac{1}{2} T(2 \by) - T(\bzero)\\ &= \frac{1}{2} (T (2\bx) - T(\bzero)) + \frac{1}{2}( T(2\by) - T(\bzero))\\ &= \frac{1}{2} (L (2\bx) + L (2\by))\\ &= \frac{1}{2} (2 L (\bx) + 2 L (\by))\\ &= L(\bx) + L (\by). \end{split}\]

Thus, \(L\) is linear. Here, we used the fact that \(L(2\bx) = T(2\bx) - T(\bzero)\) and \(L\) was already shown to be homogeneous above giving \(L(2\bx) = 2 L(\bx)\).

Now, assume \(L\) to be linear. We shall show that \(T\) is affine.

Let \(\bx, \by \in \XX\) and \(t \in \FF\). Then

\[\begin{split} T (t \bx + (1 - t) \by) &= L (t \bx + (1 - t) \by) + T(\bzero) \\ &= t L(\bx) + (1 -t) L (\by) + T(\bzero) \\ &= t L(\bx) + t T(\bzero) + (1 -t) L (\by) + (1-t)T(\bzero) \\ &= t (L (\bx) + T(\bzero)) + (1 -t) (L (\by) + T(\bzero))\\ &= tT (\bx) + (1- t) T(\by). \end{split}\]

Thus, \(T\) is affine.
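
The decomposition in this theorem can be illustrated on a concrete map. The following is a minimal sketch, assuming a hypothetical affine map \(T(\bx) = \bM \bx + \ba\) on \(\RR^2\) with rational entries; the matrix `M` and the vector `a` are illustrative values, not taken from the text. It recovers \(L(\bx) = T(\bx) - T(\bzero)\) and checks that \(L\) is homogeneous and additive while \(T\) itself is not additive.

```python
from fractions import Fraction

M = [[Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(3)]]       # assumed linear part
a = [Fraction(5), Fraction(-1)]        # assumed translation, equal to T(0)

def add(x, y):
    return [u + v for u, v in zip(x, y)]

def scale(t, x):
    return [t * u for u in x]

def T(x):
    """Affine map T(x) = M x + a."""
    return [sum(mij * xj for mij, xj in zip(row, x)) + ai for row, ai in zip(M, a)]

def L(x):
    """L(x) = T(x) - T(0), as in the proof."""
    zero = [Fraction(0)] * len(x)
    return [u - v for u, v in zip(T(x), T(zero))]

x = [Fraction(1), Fraction(2)]
y = [Fraction(-3), Fraction(4)]
t = Fraction(7, 2)

# T preserves affine combinations.
assert T(add(scale(t, x), scale(1 - t, y))) == add(scale(t, T(x)), scale(1 - t, T(y)))

# L is homogeneous and additive, hence linear.
assert L(scale(t, x)) == scale(t, L(x))
assert L(add(x, y)) == add(L(x), L(y))

# T is not additive because the translation a is nonzero.
assert T(add(x, y)) != add(T(x), T(y))
```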

4.14.13.2. Affine Combinations and Hulls#

We show that affine functions distribute over arbitrary affine combinations.

Theorem 4.194 (Affine functions on affine combinations)

Let \(\XX\) and \(\YY\) be vector spaces on a field \(\FF\). Let \(T : \XX \to \YY\) be affine.

Let \(\bx_0, \bx_1, \dots, \bx_k \in \XX\) and \(t_0, t_1, \dots, t_k \in \FF\) such that \(\sum_{i=0}^k t_i = 1\). Then,

\[ T \left ( \sum_{i=0}^k t_i \bx_i \right ) = \sum_{i=0}^k t_i T(\bx_i). \]

Proof. Define:

\[ L(\bx) = T(\bx) - T(\bzero). \]

We know that \(L\) is linear. We have \(T(\bx) = L(\bx) + T(\bzero)\).

Now,

\[\begin{split} T \left ( \sum_{i=0}^k t_i \bx_i \right ) &= L \left ( \sum_{i=0}^k t_i \bx_i \right ) + T(\bzero)\\ &= \sum_{i=0}^k t_i L( \bx_i) + T(\bzero)\\ &= \sum_{i=0}^k t_i L( \bx_i) + (\sum_{i=0}^k t_i) T(\bzero)\\ &= \sum_{i=0}^k t_i \left (L( \bx_i) + T(\bzero) \right )\\ &= \sum_{i=0}^k t_i T( \bx_i). \end{split}\]

Theorem 4.195 (Preservation of affine hulls)

Let \(\XX\) and \(\YY\) be vector spaces on a field \(\FF\). Let \(T : \XX \to \YY\) be affine. Let \(S \subseteq \XX\). Then,

\[ \affine T (S) = T (\affine S); \]

i.e., the affine hull of \(T(S)\) is the same as \(T(A)\) where \(A\) is the affine hull of \(S\).

Proof. We first show that \(\affine T (S) \subseteq T (\affine S)\)

  1. Let \(\by \in \affine T (S)\).

  2. Then, there exist \(\by_0, \dots, \by_k \in T(S)\) and \(t_0, \dots, t_k \in \FF\) such that \(\sum_{i=0}^k t_i = 1\) and

    \[ \by = \sum_{i=0}^k t_i \by_i. \]
  3. But then, \(\by_i = T(\bx_i)\) for some \(\bx_i \in S\) for every \(i=0,\dots, k\) since \(\by_i \in T(S)\).

  4. Then, due to Theorem 4.194

    \[ \by = \sum_{i=0}^k t_i \by_i = \sum_{i=0}^k t_i T (\bx_i) = T \left (\sum_{i=0}^k t_i \bx_i \right ) \]

    since \(T\) preserves affine combinations.

  5. But, \(\bx = \sum_{i=0}^k t_i \bx_i \in \affine S\) since \(\bx_i \in S\) and \(\bx\) is their affine combination.

  6. Thus, \(\by = T(\bx)\) where \(\bx \in \affine S\).

  7. Thus, \(\by \in T(\affine S)\).

  8. Thus, \(\affine T (S) \subseteq T (\affine S)\).

We now show that \(T (\affine S) \subseteq \affine T (S)\).

  1. Let \(\by \in T (\affine S)\).

  2. Then, there exists \(\bx \in \affine S\) such that \(\by = T(\bx)\).

  3. Then, there exist \(\bx_0, \dots, \bx_k \in S\) and \(t_0, \dots, t_k \in \FF\) such that \(\sum_{i=0}^k t_i = 1\) and

    \[ \bx = \sum_{i=0}^k t_i \bx_i. \]
  4. Then, due to Theorem 4.194

    \[ \by = T(\bx) = T \left ( \sum_{i=0}^k t_i \bx_i \right ) = \sum_{i=0}^k t_i T (\bx_i) \]

    since \(T\) preserves affine combinations.

  5. Let \(\by_i = T(\bx_i)\) for \(i=0,\dots,k\).

  6. Since \(\bx_i \in S\), hence \(\by_i \in T(S)\).

  7. Then,

    \[ \by = \sum_{i=0}^k t_i \by_i. \]
  8. But then, \(\by\) is an affine combination of points in \(T(S)\).

  9. Thus, \(\by \in \affine T(S)\).

  10. Thus, \(T (\affine S) \subseteq \affine T (S)\).

Combining these results:

\[ T (\affine S) = \affine T (S). \]

4.14.13.3. Invertible Affine Transformations#

Theorem 4.196 (Affine invertible = linear invertible)

An affine map is invertible if and only if its corresponding linear map as described in Theorem 4.193 is invertible.

Translation maps are invertible, and a composition of invertible maps is invertible. An affine map is the composition of its linear map with a translation; hence, it is invertible if the linear map is invertible. Conversely, the linear map is the composition of the affine map with a translation; hence, it is invertible if the affine map is invertible.

Proof. Formally, let \(\XX\) and \(\YY\) be vector spaces on a field \(\FF\). Let \(T : \XX \to \YY\) be an affine map. Let \(L : \XX \to \YY\) be the linear map given by:

\[ L(\bx) = T(\bx) - T(\bzero). \]

Let \(T(\bzero) = \ba\) and write

\[ T(\bx) = L(\bx) + \ba \text{ and } L(\bx) = T(\bx) - \ba. \]

Define a parameterized translation map \(G_{\bv} : \YY \to \YY\) as:

\[ G_{\bv} (\by) = \by + \bv \Forall \by \in \YY. \]

Note that the inverse of the translation operator is given by:

\[ G^{-1}_{\bv} (\by) = \by - \bv = G_{-\bv} (\by) \]

which is another translation operator. Thus, all translation operators are invertible.

Then,

\[ L = G_{-\ba} \circ T \text{ and } T = G_{\ba} \circ L. \]

Clearly, if \(T\) is invertible then so is \(L\) and if \(L\) is invertible then so is \(T\).

Theorem 4.197 (Inverse of affine map is affine)

Let \(\XX\) and \(\YY\) be vector spaces on a field \(\FF\). Let \(T : \XX \to \YY\) be an affine map. If \(T\) is invertible, then its inverse is also an affine map.

Proof. We are given that \(T\) is affine and its inverse exists. Let \(S : \YY \to \XX\) be the inverse of \(T\). We need to show that \(S\) is affine.

Since \(T\) is invertible, it is bijective.

  1. Let \(\by_1, \by_2 \in \YY\) and \(t \in \FF\).

  2. Then, there exist \(\bx_1, \bx_2 \in \XX\) such that

    \[ \by_1 = T(\bx_1), \by_2 = T(\bx_2). \]

  3. Since \(S = T^{-1}\), hence \(S(\by_1) = \bx_1\) and \(S(\by_2) = \bx_2\).

  4. Let \(\by = t \by_1 + (1-t) \by_2\) and \(\bx = t \bx_1 + (1-t) \bx_2\).

  5. Then

    \[\begin{split} T (\bx ) &= T(t \bx_1 + (1-t) \bx_2)\\ &= t T(\bx_1) + (1-t) T (\bx_2)\\ &= t \by_1 + (1-t) \by_2 = \by \end{split}\]

    since \(T\) is affine.

  6. Thus, \(T(\bx) = \by\).

  7. Consequently, \(S(\by) = \bx\).

  8. But then,

    \[\begin{split} S( t \by_1 + (1-t) \by_2) &= S(\by) = \bx\\ &= t \bx_1 + (1-t) \bx_2\\ &= t S(\by_1) + (1-t) S(\by_2). \end{split}\]

We have shown that for any \(\by_1, \by_2 \in \YY\) and \(t \in \FF\),

\[ S( t \by_1 + (1-t) \by_2) = t S(\by_1) + (1-t) S(\by_2). \]

Therefore, \(S\) is affine.
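
For a map of the form \(T(\bx) = \bM \bx + \ba\) with \(\bM\) invertible, the inverse can be written explicitly as \(S(\by) = \bM^{-1} (\by - \ba)\). The following is a minimal sketch, assuming a hypothetical \(2 \times 2\) rational matrix `M` and translation `a` chosen for illustration; it checks that \(S\) inverts \(T\) and that \(S\) preserves affine combinations.

```python
from fractions import Fraction

M = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(1)]]       # assumed invertible linear part (determinant 1)
a = [Fraction(3), Fraction(-2)]        # assumed translation

def matvec(m, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises ValueError if the matrix is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

M_inv = inverse_2x2(M)

def T(x):
    """Affine map T(x) = M x + a."""
    return [u + v for u, v in zip(matvec(M, x), a)]

def S(y):
    """Candidate inverse S(y) = M^{-1} (y - a)."""
    return matvec(M_inv, [u - v for u, v in zip(y, a)])

y1 = [Fraction(1), Fraction(0)]
y2 = [Fraction(-4), Fraction(7, 2)]
t = Fraction(-5, 3)

# S undoes T and T undoes S.
assert S(T(y1)) == y1 and T(S(y2)) == y2

# S preserves affine combinations, so S is affine.
y = [t * u + (1 - t) * v for u, v in zip(y1, y2)]
assert S(y) == [t * u + (1 - t) * v for u, v in zip(S(y1), S(y2))]
```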

4.14.13.4. Affine Mapping between Affine Sets#

Theorem 4.198 (Affine mapping between affine independent sets)

Let \(\VV\) be a finite dimensional vector space with \(n = \dim \VV\). Let \(\{\ba_0, \dots, \ba_m\}\) and \(\{\bb_0, \dots, \bb_m \}\) be two affine independent sets in \(\VV\) where \(m \leq n\). Then, there exists an invertible affine map, \(T: \VV \to \VV\) such that \(T(\ba_i) = \bb_i\) for \(i=0,\dots,m\). If \(m=n\), then \(T\) is unique.

Proof. If \(m=0\); i.e., we need to find a mapping between \(\{ \ba_0\}\) and \(\{ \bb_0 \}\), then we can choose any affine map

\[ T(\bx) = A(\bx) + \bb_0 - A(\ba_0) \]

where \(A: \VV \to \VV\) is any invertible linear map, so that \(T\) is invertible (Theorem 4.196) and \(T(\ba_0) = \bb_0\). Now consider the case where \(1 \leq m \leq n\).

By Theorem 4.179, we can extend \(\{\ba_0, \dots, \ba_m\}\) to an affine independent set \(\{\ba_0, \dots, \ba_n\}\) and similarly \(\{\bb_0, \dots, \bb_m \}\) to \(\{\bb_0, \dots, \bb_n \}\).

Both of these sets span the entire \(\VV\) (as affine hull).

The sets \(\BBB_a = \{\ba_1 - \ba_0, \dots, \ba_n - \ba_0\}\) and \(\BBB_b = \{\bb_1 - \bb_0, \dots, \bb_n - \bb_0\}\) are both bases for \(\VV\).

Then, there exists a unique linear transformation \(A : \VV \to \VV\) which carries \(\BBB_a\) to \(\BBB_b\); i.e.

\[ A (\ba_i - \ba_0) = \bb_i - \bb_0, \Forall i=1,\dots,n. \]

Now, consider the affine map given by

\[ T(\bx) = A (\bx) + \bb_0 - A (\ba_0) \]

where \(\bb_0 - A (\ba_0)\) is a fixed translation.

Then,

\[ T(\ba_i) = A (\ba_i) + \bb_0 - A (\ba_0) = A(\ba_i - \ba_0) + \bb_0 = \bb_i - \bb_0 + \bb_0 = \bb_i. \]

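Thus, \(T(\ba_i) = \bb_i\) for \(i=0,\dots,n\) (the case \(i=0\) follows from \(A(\bzero) = \bzero\)). Since \(A\) carries a basis of \(\VV\) to a basis of \(\VV\), it is invertible, and hence \(T\) is invertible by Theorem 4.196. Thus, \(T\) is the desired affine transformation.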

If \(m=n\), then the linear transformation \(A\) is uniquely determined by the bases \(\BBB_a\) and \(\BBB_b\) which are given. Thus, \(T\) is unique too.
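
The construction in this proof can be sketched for \(\VV = \RR^2\) and \(m = n = 2\). This is a minimal sketch under stated assumptions: the two triangles below are hypothetical rational points chosen for illustration, and `affine_map_between` is a helper name introduced here. It builds the linear map \(A\) carrying \(\BBB_a\) to \(\BBB_b\) and the translation \(\bb_0 - A(\ba_0)\).

```python
from fractions import Fraction as F

def sub(x, y):
    return [u - v for u, v in zip(x, y)]

def add(x, y):
    return [u + v for u, v in zip(x, y)]

def matvec(m, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def columns(c1, c2):
    """2x2 matrix whose columns are c1 and c2."""
    return [[c1[0], c2[0]], [c1[1], c2[1]]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises ValueError if the matrix is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("points are not affine independent")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def affine_map_between(a_pts, b_pts):
    """Return (A, shift) with A (a_i - a_0) = b_i - b_0 and shift = b_0 - A a_0."""
    a0, a1, a2 = a_pts
    b0, b1, b2 = b_pts
    Ba = columns(sub(a1, a0), sub(a2, a0))   # basis built from the a points
    Bb = columns(sub(b1, b0), sub(b2, b0))   # basis built from the b points
    inverse_2x2(Bb)                          # fails if the b points are affine dependent
    A = matmul(Bb, inverse_2x2(Ba))          # linear map carrying Ba to Bb
    shift = sub(b0, matvec(A, a0))
    return A, shift

# Assumed example triangles.
a_pts = [[F(1), F(1)], [F(2), F(1)], [F(1), F(3)]]
b_pts = [[F(0), F(0)], [F(1), F(1)], [F(-1), F(2)]]

A, shift = affine_map_between(a_pts, b_pts)

def T(x):
    """T(x) = A x + b_0 - A a_0."""
    return add(matvec(A, x), shift)

assert [T(p) for p in a_pts] == b_pts
```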

Corollary 4.33 (Affine mapping between affine sets)

Let \(\VV\) be a finite dimensional vector space with \(n = \dim \VV\). Let \(A\) and \(B\) be two affine subspaces of \(\VV\) of the same dimension; i.e., \(\dim A = \dim B = m\) where \(m \leq n\). Then, there exists a bijective affine transformation \(T : \VV \to \VV\) such that \(T (A) = B\).

Proof. If both \(A\) and \(B\) are singletons given by \(A = \{ \ba \}\) and \(B = \{ \bb \}\), then any affine transformation given by

\[ T(\bx) = M(\bx) + \bb - M(\ba) \]

where \(M: \VV \to \VV\) is an invertible linear transformation will do.

For \(1 \leq m \leq n\), by Theorem 4.177 we can choose \(m+1\) affine independent points \(\{\ba_0, \dots, \ba_m\}\) and \(\{\bb_0, \dots, \bb_m \}\) in \(A\) and \(B\) respectively such that

\[ A = \affine \{\ba_0, \dots, \ba_m\} \text{ and } B = \affine \{\bb_0, \dots, \bb_m \} \]

as per Theorem 4.178.

Then, by Theorem 4.198, an affine mapping \(T: \VV \to \VV\) exists which maps \(\ba_i\) to \(\bb_i\). Since affine mappings preserve affine hulls (Theorem 4.195), hence

\[ T (A) = B. \]

4.14.13.5. Graph#

Theorem 4.199 (Graph of an affine map is affine)

Let \(\XX\) and \(\YY\) be vector spaces on a field \(\FF\). Let \(T : \XX \to \YY\) be an affine map. Let \(\XX \oplus \YY\) be the direct sum of \(\XX\) and \(\YY\).

Let \(G \subseteq \XX \oplus \YY\) be the graph of \(T\) given by

\[ \graph T = \{ (\bx, T(\bx)) \ST \bx \in \XX \}. \]

Then, \(G\) is an affine subset of \(\XX \oplus \YY\).

In other words, graph of an affine map is affine.

Proof. If \(\by = T(\bx)\) then \(\bz = (\bx, \by) \in G\).

Now, let \(\bz_1, \bz_2 \in G\) and \(t \in \FF\).

  1. \(\bz_1 = (\bx_1, \by_1)\) such that \(\by_1 = T(\bx_1)\).

  2. \(\bz_2 = (\bx_2, \by_2)\) such that \(\by_2 = T(\bx_2)\).

  3. Let \(\bz = t \bz_1 + (1-t) \bz_2\).

  4. Then,

    \[\begin{split} \bz &= t (\bx_1, \by_1) + (1-t) (\bx_2, \by_2)\\ &= (t \bx_1 + (1-t) \bx_2, t \by_1 + (1-t)\by_2). \end{split}\]
  5. Since \(T\) is affine, hence

    \[\begin{split} T(t \bx_1 + (1-t) \bx_2) &= t T(\bx_1) + (1-t) T(\bx_2)\\ &= t \by_1 + (1-t)\by_2. \end{split}\]
  6. Thus, \(\bz = (t \bx_1 + (1-t) \bx_2, t \by_1 + (1-t)\by_2) \in G\).

  7. Thus, \(G\) is closed under affine combinations.

  8. Thus, \(G\) is affine.

As an implication, we can see that the graph of a linear map must be an affine set too since every linear map is an affine map. But a linear map maps \(\bzero_x\) to \(\bzero_y\). Thus, its graph contains the origin \((\bzero_x, \bzero_y)\) of \(\XX \oplus \YY\). Thus, the graph of a linear map must be a subspace of \(\XX \oplus \YY\).

4.14.14. Topology in Normed Spaces#

We next consider the special case of a vector space \(\VV\) endowed with a norm \(\| \cdot \| : \VV \to \RR\), which induces a metric \(d: \VV \times \VV \to \RR\) given by:

\[ d (x, y) = \| x - y \|. \]

\(\VV\) equipped with this metric becomes a metric space and is endowed with a metric topology. Useful topological properties of affine sets and transformations are discussed below.

Readers are encouraged to review the material in Normed Linear Spaces before proceeding further as the results presented here develop on the material presented in that section.

Our discussions are restricted to finite dimensional normed linear spaces as linear subspaces are closed (Theorem 4.64) and linear transformations are continuous (Theorem 4.63) in the finite dimensional spaces.

4.14.14.1. Affine Sets#

Theorem 4.200 (Affine sets are closed)

Every affine subset of a finite dimensional normed linear space \(\VV\) is a closed set.

Proof. \(\EmptySet\) and \(\VV\) are closed by definition. Singletons \(\{\bx \}\) are closed due to Theorem 3.8.

All other affine sets are translations of a linear subspace.

  1. By Theorem 4.64, linear subspaces are closed in a finite dimensional normed linear space.

  2. By Theorem 4.46, translations preserve closed sets.

  3. Hence, affine sets of dimension greater than zero which are translates of the linear subspaces are also closed.

Theorem 4.201

Every proper affine subspace of a normed linear space \(\VV\) has an empty interior.

Proof. We proceed as follows:

  1. By Corollary 4.10, every proper linear subspace of \(\VV\) has an empty interior.

  2. A proper affine subspace is a translate of a proper linear subspace.

  3. By Theorem 4.46, if a set has an empty interior, then so does its translate.

Theorem 4.202 (Affine hull and closure)

Let \(\VV\) be a finite dimensional normed linear space. Let \(C \subseteq \VV\). Then,

\[ \affine (\closure C) = \affine C. \]

Proof. Since \(C \subseteq \closure C\), hence \(\affine C \subseteq \affine (\closure C)\).

  1. Let \(A = \affine C\).

  2. By Theorem 4.200, \(A\) is closed.

  3. By definition \(\closure C\) is the smallest closed set that contains \(C\).

  4. By Proposition 3.6, any closed set that contains \(C\) also contains \(\closure C\).

  5. Thus, \(\closure C \subseteq \affine C\).

  6. Now, \(\affine C\) is an affine set.

  7. By definition, the affine hull is the smallest affine set that contains a set.

  8. Hence, \(\affine (\closure C) \subseteq \affine C\).

Together, we have:

\[ \affine (\closure C) = \affine C. \]

4.14.14.2. Affine Transformations#

Theorem 4.203 (Affine transformations from finite dimensional spaces are continuous)

Let \((\VV, \| \cdot \|_v)\) and \((\WW, \| \cdot \|_w)\) be normed linear spaces. Let \(T : \VV \to \WW\) be an affine transformation.

If \(\VV\) is finite dimensional, then \(T\) is continuous.

Proof. We can write \(T\) as the composition of a linear transformation followed by a translation.

  1. By Theorem 4.63, the linear transformation is continuous since \(\VV\) is finite dimensional.

  2. By Theorem 4.45, translations are continuous.

  3. By Theorem 3.46, composition of continuous functions is continuous.

  4. Hence, \(T\) is continuous.

Theorem 4.204 (Affine transformation and closure)

Let \((\VV, \| \cdot \|_v)\) and \((\WW, \| \cdot \|_w)\) be normed linear spaces. Let \(T : \VV \to \WW\) be an affine transformation.

Assume that \(\VV\) is finite dimensional. Let \(A \subseteq \VV\). Then,

\[ T (\closure A) \subseteq \closure T(A). \]

Proof. By Theorem 4.203, \(T\) is continuous.

By Theorem 3.42 (4)

\[ T (\closure A) \subseteq \closure T(A) \]

holds true for every subset \(A\) of \(\VV\).

Recall from Definition 3.64 that a real valued function is closed if every sublevel set is closed.

Theorem 4.205 (Real valued affine functions are closed)

Let \((\VV, \| \cdot \|)\) be an \(n\)-dimensional normed linear space. Let \(T : \VV \to \RR\) be an affine function. Then, \(T\) is closed.

Proof. By Theorem 4.203, \(T\) is continuous.

  1. Let \(a \in \RR\).

  2. The sublevel set for \(a\) is given by \(S_a = \{ \bx \in \VV \ST T(\bx) \leq a \}\).

  3. This is nothing but \(T^{-1} (-\infty, a]\).

  4. The set \((-\infty, a]\) is a closed set.

  5. Since \(T\) is continuous, hence \(T^{-1}(-\infty, a]\) is also closed.

  6. Thus, \(S_a\) is closed for every \(a \in \RR\).

  7. Thus, \(T\) is closed.

4.14.14.3. Affine Homeomorphisms#

Theorem 4.206

Let \(\VV\) be a finite dimensional normed linear space. A bijective affine transformation \(T : \VV \to \VV\) is a homeomorphism.

Proof. We proceed as follows:

  1. By Theorem 4.203, \(T\) is continuous.

  2. Since \(T\) is bijective, hence, \(T^{-1}\) exists.

  3. By Theorem 4.197, \(T^{-1}\) is affine.

  4. Again, by Theorem 4.203, \(T^{-1}\) is continuous.

  5. Thus, \(T\) is a homeomorphism.

Theorem 4.207

Let \(\VV\) be a finite dimensional normed linear space. A bijective affine transformation \(T : \VV \to \VV\) preserves closures.

In other words, for any \(A \subseteq \VV\):

\[ T (\closure A) = \closure (T (A)). \]

Proof. By Theorem 4.206, \(T\) is a homeomorphism.

By Theorem 3.51, homeomorphisms preserve closures.

Thus, for any \(A \subseteq \VV\)

\[ T (\closure A) = \closure (T (A)). \]

Theorem 4.208

Let \(\VV\) be a finite dimensional normed linear space. A bijective affine transformation \(T : \VV \to \VV\) preserves interiors.

In other words, for any \(A \subseteq \VV\):

\[ T (\interior A) = \interior (T (A)). \]

Proof. By Theorem 4.206, \(T\) is a homeomorphism.

By Theorem 3.52, homeomorphisms preserve interiors.

Thus, for any \(A \subseteq \VV\)

\[ T (\interior A) = \interior (T (A)). \]

4.14.15. Real Valued Affine Functions#

In this subsection, we look at affine functions from a vector space \(\VV\) to the real line \(\RR\).

Theorem 4.209 (Level sets of real valued affine functions)

Let \(\VV\) be a vector space. Let \(h : \VV \to \RR\) be an affine function. Then, for any \(c \in \RR\), the set \(h^{-1}(c)\) is an affine set where

\[ h^{-1}(c) = \{ \bx \in \VV \ST h(\bx) = c \}. \]

Proof. We are given that \(h : \VV \to \RR\) is affine.

  1. Let \(c \in \RR\).

  2. If \(h^{-1}(c)\) is empty, then it is affine and there is nothing to prove. So assume that it is nonempty.

  3. Let \(\bx, \by \in h^{-1}(c)\).

  4. Thus, \(h(\bx) = h(\by) = c\).

  5. Let \(t \in \FF\).

  6. Let \(\bz = t \bx + (1-t) \by\).

  7. Then, by affine nature of \(h\)

    \[ h(\bz) = h (t \bx + (1-t) \by) = t h(\bx) + (1-t) h(\by) = t c + (1 -t) c = c. \]
  8. Thus, \(\bz \in h^{-1}(c)\).

  9. Thus, for any \(\bx, \by \in h^{-1}(c)\) and \(t \in \FF\), \(\bz = t \bx + (1-t) \by \in h^{-1}(c)\).

  10. Thus, \(h^{-1}(c)\) is an affine set.
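
As a small worked illustration (the specific function is an assumption chosen for this example), let \(\VV = \RR^2\) and \(h(\bx) = 2 x_1 - x_2 + 3\). Then

\[ h^{-1}(5) = \{ \bx \in \RR^2 \ST 2 x_1 - x_2 + 3 = 5 \} = \{ \bx \in \RR^2 \ST 2 x_1 - x_2 = 2 \} = \{ (1, 0) + s (1, 2) \ST s \in \RR \}, \]

which is a line through \((1, 0)\) with direction \((1, 2)\), and hence an affine set.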