7.12. Separation Theorems#

Separation theorems provide us ways to find a separating hyperplane between two disjoint convex sets. Hyperplanes require the notion of an inner product.

Throughout this section, we assume that \(\VV\) is a finite dimensional real vector space endowed with norm \(\| \cdot \| : \VV \to \RR\) and a real inner product \(\langle \cdot, \cdot \rangle : \VV \times \VV \to \RR\). The norm in general is not necessarily induced by the inner product. The Euclidean norm is denoted by \(\| \cdot \|_2\).

7.12.1. Types of Separating Hyperplanes#

Definition 7.59 (Separating hyperplane)

Let \(\VV\) be a real \(n\)-dimensional vector space. For any two subsets \(C\) and \(D\) of \(\VV\), a hyperplane \(H\) is said to separate \(C\) and \(D\) if \(C\) is contained in one of the closed halfspaces corresponding to \(H\) and \(D\) is contained in the other closed halfspace.

Let \(H\) be given by

\[ H = \{ \bx \in \VV \ST \langle \bx, \ba \rangle = b \} \]

where \(\ba \in \VV^*\) and \(b \in \RR\).

If \(C\) and \(D\) are separated by \(H\), then either

\[ \langle \bx_1, \ba \rangle \leq b \leq \langle \bx_2, \ba \rangle \Forall \bx_1 \in C, \bx_2 \in D \]

or

\[ \langle \bx_2, \ba \rangle \leq b \leq \langle \bx_1, \ba \rangle \Forall \bx_1 \in C, \bx_2 \in D \]

holds true.

We can also say that the two sets \(C\) and \(D\) can be separated by a hyperplane or there exists a hyperplane separating \(C\) and \(D\) if there exists a nonzero vector \(\ba \in \VV^*\) such that

\[ \sup_{\bx \in C} \langle \bx, \ba \rangle \leq \inf_{\bx \in D} \langle \bx, \ba \rangle. \]

Any choice of \(b \in [l,u]\) where \(l = \sup_{\bx \in C} \langle \bx, \ba \rangle\) and \(u = \inf_{\bx \in D} \langle \bx, \ba \rangle\) describes a separating hyperplane.

This definition allows for some degenerate possibilities where both \(C \subseteq H\) and \(D \subseteq H\) since the closed halfspaces contain \(H\) (as their boundary).

Definition 7.60 (Proper separation)

Let \(\VV\) be a real \(n\)-dimensional vector space. For any two subsets \(C\) and \(D\) of \(\VV\), a hyperplane \(H\) is said to properly separate \(C\) and \(D\) if \(H\) separates them and both are not entirely contained in \(H\); i.e., either \(C \not\subseteq H\) or \(D \not\subseteq H\) or both.

Proper separation still allows for the possibility that parts of \(C\) or \(D\) lies inside \(H\). But it ensures that \(C \triangle D \neq \EmptySet\). \(C \cap D \subseteq H\) is the common part.

If \(C\) and \(D\) are properly separated by \(H\), then they are also separated by \(H\). Hence

\[ \langle \bx_1, \ba \rangle \leq b \leq \langle \bx_2, \ba \rangle \Forall \bx_1 \in C, \bx_2 \in D \]

or

\[ \langle \bx_2, \ba \rangle \leq b \leq \langle \bx_1, \ba \rangle \Forall \bx_1 \in C, \bx_2 \in D \]

holds true. Also, there exists at least one \(\bx_1 \in C\) and \(\bx_2 \in D\) such that at least one of these inequalities is a strict inequality thus ensuring that either \(C\) or \(D\) or both are not contained entirely in \(H\).

We can also say that the two sets \(C\) and \(D\) can be properly separated by a hyperplane or there exists a hyperplane properly separating \(C\) and \(D\) if there exists a nonzero vector \(\ba \in \VV^*\) such that

\[ \sup_{\bx \in C} \langle \bx, \ba \rangle \leq \inf_{\bx \in D} \langle \bx, \ba \rangle. \]

and

\[ \inf_{\bx \in C} \langle \bx, \ba \rangle < \sup_{\bx \in D} \langle \bx, \ba \rangle. \]

Any choice of \(b \in [l,u]\) where \(l = \sup_{\bx \in C} \langle \bx, \ba \rangle\) and \(u = \inf_{\bx \in D} \langle \bx, \ba \rangle\) describes a properly separating hyperplane. This characterization is proved in Theorem 7.161 below.

If a convex set has a nonempty interior, then it cannot be contained in a hyperplane. Two disjoint sets one of which has a nonempty interior can indeed be properly separated.

Definition 7.61 (Strong separation)

Let \(\VV\) be a real \(n\)-dimensional vector space. For any two subsets \(C\) and \(D\) of \(\VV\), a hyperplane \(H\) is said to strongly separate \(C\) and \(D\) if there exists an \(r > 0\) such that \(C + r B\) is contained in one of the open halfspaces corresponding to \(H\) and \(D + rB\) is contained in the other open halfspace.

Under these assumptions, one set lies in the interior of \(H_{++}\) and the other set lies in the interior of \(H_{--}\). Under strong separation, \(C \cap D = \EmptySet\) is guaranteed since \(H_{++} \cap H_{--} = \EmptySet\).

We can also say that the two sets \(C\) and \(D\) can be strongly separated by a hyperplane or there exists a hyperplane strongly separating \(C\) and \(D\) if there exists a nonzero vector \(\ba \in \VV^*\) such that

\[ \sup_{\bx \in C} \langle \bx, \ba \rangle < \inf_{\bx \in D} \langle \bx, \ba \rangle. \]

Any choice of \(b \in (l,u)\) where \(l = \sup_{\bx \in C} \langle \bx, \ba \rangle\) and \(u = \inf_{\bx \in D} \langle \bx, \ba \rangle\) describes a strongly separating hyperplane.

Strong separation is called as strict separation in [5].

Remark 7.9

Separation is translation invariant. For any \(\ba \in \VV\) and convex sets \(C, D \subseteq \VV\),

\(C\) and \(D\) are separated by a hyperplane \(H\) if and only if \(C - \ba\), \(D -\ba\) are separated by the hyperplane \(H - \ba\).

7.12.2. Disjoint Affine and Convex Sets#

Separation between sets can be discussed at several levels. We start with a simple case where an affine set and a relatively open convex set are disjoint.

Theorem 7.160 (Disjoint affine and relatively open convex sets)

Let \(\VV\) be a real \(n\)-dimensional vector space. Let \(C\) be a nonempty relatively open convex set of \(\VV\). Let \(M \subseteq \VV\) be an affine subspace such that \(M \cap C = \EmptySet\). Then, there exists a hyperplane \(H\) such that \(M \subseteq H\) and one of the two open halfspaces associated with \(H\) contains \(C\).

Proof. Consider the case where \(M\) is a hyperplane. Then, \(H = M\) and the result is immediate.

  1. If for contradiction, \(C\) is not contained in one of the open halfspaces of \(M\).

  2. Then, \(C \cap H \neq \EmptySet\) since \(H\) is the boundary of the halfspaces.

  3. It will contradict our hypothesis that \(C \cap M = \EmptySet\).

  4. Thus, \(C\) must be in one of the open halfspaces of \(H\).

Now, assume that \(M\) is not a hyperplane.

Without loss of generality, assume that \(\bzero \in M\). We can do this by picking any element \(\bm\) of \(M\) and replacing \(M\) by \(M - \bm\) and \(C\) by \(C - \bm\). Thus, \(M\) is a subspace.

The proof is constructive. We iteratively construct subspaces of increasing dimension which contain \(M\) but don’t contain \(C\) till we reach a hyperplane.

  1. Let \(\dim M = d\) where \(d < n\).

  2. Let \(S = M\) be the initial subspace.

  3. Let \(r=n-1-d\) be the number of iterations (if \(M\) is a hyperplane, no iterations are needed).

  4. For \(i=1,\dots,r\):

    1. Replace \(S\) by a subspace \(S'\) such that \(\dim S' = \dim S + 1\) and \(S' \cap C = \EmptySet\).

    2. Let \(S = S'\)

  5. We have arrived with a hyperplane \(S\) such that \(S \cap C = \EmptySet\).

Towards this, let us look at the procedure to construct \(S'\) from \(S\) such that \(\dim S' = \dim S + 1\) and \(S \cap C = \EmptySet\). Here is the outline

  1. Let \(S^{\perp}\) be the orthogonal complement of \(S\).

  2. Pick any 2D subspace \(P\) of \(S^{\perp}\).

  3. Pick a line \(L\) in \(P\) which passes through origin but doesn’t intersect with \(C\); i.e. \(L \cap C = \EmptySet\).

  4. Construct the subspace \(S' = S \oplus L\). The direct sum is valid since \(S \cap L = \{ \bzero \}\).

  5. Note that \(\dim S' = \dim S + 1\) and \(S' \cap C = \EmptySet\) since \(S \cap C = \EmptySet\) as well as \(L \cap C = \EmptySet\).

We now elaborate the procedure for finding the line \(L\):

  1. We have \(S\) as a linear subspace with \(\dim S < n -1\) and \(S \cap C = \EmptySet\).

  2. \(\dim S^{\perp} = n - \dim S > n - (n - 1) = 1\).

  3. Since \(\dim S^{\perp} > 1\), it contains a two dimensional subspace \(P\).

  4. Consider the set \(C' = P \cap (C - S)\).

  5. The set \(C-S\) has some properties.

    1. \(C-S\) is convex since it is the sum of two convex sets.

    2. Since \(\bzero \in S\), hence \(C = C - \bzero \subseteq C - S\).

    3. Since \(C \cap S = \EmptySet\), hence \(\bzero \notin C - S\).

    4. Note that due to Theorem 7.157:

      \[ \relint (C - S) = \relint C - \relint S = C - S \]

      since \(C\) is relatively open and \(S\) is affine, hence relatively open.

    5. Thus, \(C - S\) is relatively open.

  6. Consequently, the set \(C'\) also has some properties.

    1. \(C'\) is convex since it is the intersection of two convex sets.

    2. Since \(\bzero \notin C - S\), hence, \(\bzero \not\in C' = P \cap (C - S)\).

    3. \(C'\) is relatively open.

      1. If \(C'\) is empty, then \(C'\) is relatively open vacuously.

      2. Otherwise, by Theorem 7.155:

      \[ \relint C' = \relint (P \cap (C - S)) = P \cap \relint (C - S) = P \cap (C- S). \]

      Thus, \(\relint C' = C'\) implies that \(C'\) is relatively open.

  7. We now seek a line \(L \subseteq P\) passing through the origin \(\bzero\) that doesn’t intersect \(C'\).

  8. We note that \(L \cap C' = \EmptySet \implies L \cap C = \EmptySet\).

    1. Suppose for contradiction that \(L \cap C' = \EmptySet\) but \(L \cap C \neq \EmptySet\).

    2. Let \(\bx \in L \cap C\).

    3. Then, \(\bx \in L \subseteq P\).

    4. Also, \(\bx \in C \subseteq C - S\).

    5. Thus, \(\bx \in C' = P \cap (C - S)\).

    6. This means that \(\bx \in L \cap C'\).

    7. In other words, \(L \cap C' \neq \EmptySet\) which contradicts our assumption.

  9. How to choose \(L\)? There are three possibilities:

    1. \(C'\) is empty.

    2. \(C'\) is contained in a line; i.e., \(\dim \affine C' = 1\).

    3. \(\affine C' = P\) and \(\dim \affine C' = 2\).

    \(\affine C'\) cannot be larger than \(P\) since \(C' \subseteq P\) and \(P\) is affine.

  10. If \(C'\) is empty, then any line in \(P\) passing through origin will do.

  11. If \(\affine C\) is a line.

    1. If \(\affine C'\) passes through the origin, we can take a line perpendicular to \(\affine C'\) passing through origin. In this case, \(L \cap \affine C' = \{\bzero \}\). Since \(\bzero \notin C'\), hence \(L \cap C' = \EmptySet\).

    2. If \(\affine C'\) is a line that doesn’t pass through \(\bzero\), we can simply take a line that is parallel to \(\affine C'\) and passes through \(\bzero\). In this case \(L \cap \affine C' = \EmptySet\). Consequently, \(L \cap C' = \EmptySet\).

  12. If \(\affine C'\) is the entire 2D subspace \(P\).

    1. Consider the set \(K = \bigcup \{t C' \ST t > 0 \}\).

    2. \(C' \subseteq K\) (specifically, for \(t = 0\)).

    3. \(K\) is a convex cone containing \(C'\) but not containing \(\bzero\).

    4. \(K\) is relatively open since \(C'\) is relatively open.

    5. A boundary ray of \(K\) doesn’t intersect with \(C'\) since \(K\) is relatively open.

    6. Take \(L\) to be any of the two boundary rays of \(K\) and extend it to a line passing through \(\bzero\).

7.12.3. Proper Separation Characterization#

Theorem 7.161 (Characterization of proper separation)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be nonempty subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) properly if and only if there exists a vector \(\ba \in \VV\) such that

\[ \inf \{ \langle \bx, \ba \rangle \ST \bx \in S \} \geq \sup \{ \langle \bx, \ba \rangle \ST \bx \in T \} \]

and

\[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} > \inf \{ \langle \bx, \ba \rangle \ST \bx \in T \}. \]

Proof. Assume that there is some vector \(\ba \in \VV\) satisfying the conditions above.

  1. Since \(\ba\) satisfies the second strict inequality, hence \(\ba \neq \bzero\). Otherwise, the second inequality cannot hold.

  2. Choose any \(b \in \RR\) such that

    \[ \underset{\bx \in S}{\inf}\{ \langle \bx, \ba \rangle \} \geq b \geq \underset{\bx \in S}{\sup}\{ \langle \bx, \ba \rangle \}. \]
  3. Then, the set \(H = \{ \bx \ST \langle \bx, \ba \rangle = b \}\) is a hyperplane.

  4. Let \(H_+ = \{ \bx \ST \langle \bx, \ba \rangle \geq b \}\) and \(H_- = \{ \bx \ST \langle \bx, \ba \rangle \leq b \}\).

  5. Clearly, \(S \subseteq H_+\) and \(T \subseteq H_-\).

  6. The second strict inequality ensures that both \(S\) and \(T\) cannot be contained in \(H\).

    1. For contradiction, assume \(S \subseteq H\) and \(T \subseteq H\)

    2. Then \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} = b\) and \(\inf \{ \langle \bx, \ba \rangle \ST \bx \in T \} = b\).

    3. Thus, the second inequality will not hold.

  7. Thus, \(S\) and \(T\) are properly separated.

Now, assume that \(S\) and \(T\) are properly separated by some hyperplane \(H\).

  1. Let \(H = \{ \bx \ST \langle \bx, \ba \rangle = b \}\) be the specification of the said hyperplane.

  2. Let \(H_+ = \{ \bx \ST \langle \bx, \ba \rangle \geq b \}\) and \(H_- = \{ \bx \ST \langle \bx, \ba \rangle \leq b \}\).

  3. Without loss of generality, assume that \(S \subseteq H_+\) and \(T \subseteq H_+\).

  4. Then, \(\langle \bx, \ba \rangle \geq b\) for every \(\bx \in S\).

  5. Thus, \(\inf \{ \langle \bx, \ba \rangle \ST \bx \in S \} \geq b\).

  6. Similarly, \(\langle \bx, \ba \rangle \leq b\) for every \(\bx \in T\).

  7. Thus, \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in T \} \leq b\).

  8. Thus, the first inequality is satisfied.

  9. Since \(H\) properly separates \(S\) and \(T\), hence either \(S \not\subseteq H\) or \(T \not\subseteq H\).

  10. If \(S \not\subseteq H\), then there exists \(\bx \in S\) such that \(\langle \bx, \ba \rangle > b\).

  11. Consequently, \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} > b\).

  12. If \(T \not\subseteq H\), then there exists \(\bx \in T\) such that \(\langle \bx, \ba \rangle < b\).

  13. Consequently, \(\inf \{ \langle \bx, \ba \rangle \ST \bx \in T \} < b\).

  14. Combining,

    \[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} > \inf \{ \langle \bx, \ba \rangle \ST \bx \in T \} \]

    must be true.

7.12.3.1. Proper Separation : Set and Point#

Remark 7.10 (Proper separation between a set and a point)

Let \(\VV\) be an \(n\)-dimensional real vector space. We consider the case of proper separation between a set \(S\) and a point \(\bp\). We can form a set \(T = \{ \bp \}\). Then the proper separation between \(S\) and \(\bp\) can be described in terms of proper separation between \(S\) and \(T\).

Then

\[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in T \} = \inf \{ \langle \bx, \ba \rangle \ST \bx \in T \} = \langle \bp, \ba \rangle. \]

Hence, \(S\) and \(\bp\) are properly separated if and only if there exists a vector \(\ba \in \VV\) such that

\[ \inf \{ \langle \bx, \ba \rangle \ST \bx \in S \} \geq \langle \bp, \ba \rangle \text{ and } \sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} > \langle \bp, \ba \rangle. \]

We now consider a hyperplane

\[ H = \{ \bx \in \VV \ST \langle \bx, \ba \rangle = \langle \bp, \ba \rangle \}. \]

And let \(H_+\) denote one of its closed half-spaces

\[ H_+ = \{ \bx \in \VV \ST \langle \bx, \ba \rangle \geq \langle \bp, \ba \rangle \}. \]

We can clearly see that

  1. \(\ba \in H\).

  2. \(S \subseteq H_+\). \(S\) is contained entirely in the closed half-space \(H_+\).

  3. \(S \not\subseteq H\). \(S\) is not contained entirely in \(H\).

Thus, if a set and a point are properly separated, then there exists a hyperplane that passes through the point, contains the set in one of its half-spaces but does not fully contain the set itself.

7.12.4. Separating Hyperplane Theorems#

Theorem 7.162 (Separating hyperplane theorem I)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be nonempty convex subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) properly if and only if \(\relint S \cap \relint T = \EmptySet\).

In other words, two sets are properly separated if and only if their relative interiors are disjoint.

Proof. Let \(C = S - T\).

  1. By Theorem 7.142, \(\relint S\) and \(\relint T\) are nonempty since \(S\) and \(T\) are nonempty convex sets.

  2. By Theorem 7.157, \(\relint C = \relint S - \relint T\) and it is nonempty.

  3. Note that, \(\bzero \notin \relint C\) if and only if \(\relint S \cap \relint T = \EmptySet\).

    1. If \(\bzero \in \relint C\) then there exists \(\bx \in \relint S \cap \relint T\) such that \(\bx - \bx = \bzero\).

Now assume that \(\relint S \cap \relint T = \EmptySet\).

  1. Thus, \(\bzero \notin \relint C\).

  2. Consider the affine set \(M = \{ \bzero \}\).

  3. Clearly, \(M \cap \relint C = \EmptySet\) and \(\relint C\) is relatively open.

  4. By Theorem 7.160, there exists a hyperplane containing \(M\) such that \(\relint C\) is a subset of one of its associated open halfspaces.

  5. Let \(H\) be this hyperplane given by \(H = \{\bx \ST \langle \bx, \ba \rangle = 0\}\). Since the hyperplane contains \(M\), hence it is a subspace.

  6. By Theorem 7.149, \( C \subseteq \closure C \subseteq \closure \relint C\).

  7. Thus, \(C\) is contained in the corresponding closed halfspace.

  8. Without loss of generality, assume that \(C\) is contained in the nonnegative halfspace \(H_+\).

  9. Thus, \(\inf \{ \langle \bx, \ba \rangle \ST \bx \in C\} \geq 0\).

  10. This means that

    \[ \inf \{ \langle \bx, \ba \rangle \ST \bx \in C\} = \inf \{ \langle \bx, \ba \rangle \ST \bx \in S\} - \sup \{ \langle \bx, \ba \rangle \ST \bx \in T\} \geq 0. \]
  11. Thus,

    \[ \inf \{ \langle \bx, \ba \rangle \ST \bx \in S\} \geq \sup \{ \langle \bx, \ba \rangle \ST \bx \in T\}. \]
  12. Since \(\relint C \in H_{++}\), there exists \(\bx \in C\), such that \(\langle \bx, \ba \rangle > 0\).

  13. Thus, \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in C\} > 0\).

  14. But then,

    \[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in C\} = \sup \{ \langle \bx, \ba \rangle \ST \bx \in S\} - \inf \{ \langle \bx, \ba \rangle \ST \bx \in T\} > 0. \]
  15. Thus,

    \[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in S\} > \inf \{ \langle \bx, \ba \rangle \ST \bx \in T\}. \]
  16. Then, by Theorem 7.161, \(S\) and \(T\) are properly separated.

Now, for the converse, assume that \(S\) and \(T\) are properly separated. Thus, there exists a vector \(\ba \in \VV\) such that

\[ \inf \{ \langle \bx, \ba \rangle \ST \bx \in S \} \geq \sup \{ \langle \bx, \ba \rangle \ST \bx \in T \} \]

and

\[ \sup \{ \langle \bx, \ba \rangle \ST \bx \in S \} > \inf \{ \langle \bx, \ba \rangle \ST \bx \in T \}. \]
  1. From the first inequality, we get that \(\inf \{ \langle \bx, \ba \rangle \ST \bx \in C\} \geq 0\).

  2. From the second inequality, we get that \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in C\} > 0\).

  3. Thus, there exists a hyperplane \(H\) given by \(H = \{\bx \ST \langle \bx, \ba \rangle = 0\}\) such that the corresponding nonnegative closed halfspace \(H_+ = \{\bx \ST \langle \bx, \ba \rangle \geq 0\}\) contains \(C\).

  4. Note that

    \[ \interior H_+ = \relint H_+ = H_{++} = \{\bx \ST \langle \bx, \ba \rangle > 0\}. \]
  5. Since \(\sup \{ \langle \bx, \ba \rangle \ST \bx \in C\} > 0\), hence \(H_{++} \cap C \neq \EmptySet\).

  6. \(C \subseteq \closure H_{++} = H_+\) but \(C \not\subseteq \closure H_{++} \setminus \relint H_{++} = H_+ \setminus H_{++} = H\).

  7. Thus, by Theorem 7.156, \(\relint C \subseteq \relint H_{++} = H_{++}\).

  8. Thus, \(\bzero \notin \relint C\).

  9. Thus, \(\relint S \cap \relint T = \EmptySet\).

We are done.

Corollary 7.10 (Separating hyperplane theorem II)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be nonempty disjoint convex subsets of \(\VV\); i.e., \(S \cap T = \EmptySet\).

Then, there exists a hyperplane that properly separates them.

Proof. Since \(S \cap T = \EmptySet\), hence \(\relint S \cap \relint T = \EmptySet\). Then, applying Theorem 7.162, there exists a hyperplane that properly separates \(S\) and \(T\).

This result is stronger than proposition 2.4.2 in [5].

Disjointness of convex sets is not enough for strong separation as their closures might meet.

Theorem 7.163 (Separating hyperplane theorem III)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) be a nonempty convex subset of \(\VV\). Let \(\ba \in \VV\) be a vector. There exists a hyperplane \(H\) that separates \(S\) and \(\ba\) properly if and only if \(\ba \notin \relint S\).

In other words, a point can be properly separated from a convex set if and only if it doesn’t belong to its relative interior.

Proof. .

  1. We form a set \(T = \{ \ba \}\).

  2. Then \(\relint T = \{ \ba \}\) (Theorem 7.133).

  3. By Theorem 7.162, \(S\) and \(T\) are properly separated if and only if

    \[ \relint S \cap \relint T = \EmptySet. \]
  4. In other words,

    \[ \relint S \cap \{ \ba \} = \EmptySet. \]
  5. In other words, \(\ba \notin \relint S\).

7.12.5. Strong Separation Characterization#

Theorem 7.164 (Characterization of strong separation)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be nonempty convex subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) strongly if and only if

\[ \inf \{ \| \bx - \by \| \ST \bx \in S, \by \in T \} > 0. \]

In other words, \(H\) strongly separates \(S\) and \(T\) if and only if \(\bzero \notin \closure (S - T)\).

Proof. Let \(C = S-T\). Then

\[ \inf \{ \| \bx - \by \| \ST \bx \in S, \by \in T \} = \inf \{ \| \bv \| \ST \bv \in C \}. \]

By Theorem 4.51, \(\inf \{ \| \bv \| \ST \bv \in C \} = 0\) if and only if \(\bzero \in \closure C\).

Thus, \(\inf \{ \| \bv \| \ST \bv \in C \} > 0\) if and only if \(\bzero \notin \closure C\).

Assume that \(S\) and \(T\) are strongly separated. Let \(H\) be the hyperplane that separates \(S\) and \(T\). Let \(H_{++}\) be the positive open halfspace. Let \(H_{--}\) be the negative open halfspace.

  1. Then, there exists an \(r > 0\) such that \(S + r B \subseteq H_{++}\) and \(T + rB \subseteq H_{--}\).

  2. Since \(H_{++} \cap H_{--} = \EmptySet\), hence \((S + r B) \cap (T + r B) = \EmptySet\).

  3. Then, for any \(\bx \in S\) and \(\by \in T\) and every \(\bu, \bv \in B\),

    \[ \| \bx + r \bu - (\by + r\bv ) \| > 0 \]

    as \(\bx + r \bu = \by + r \bv\) would mean that \((S + r B) \cap (T + r B) \neq \EmptySet\)

  4. Note that, \(\| \bx + r \bu - (\by + r\bv ) \| = \| (\bx - \by) - r (\bv - \bu) \|\).

  5. Then, \(\| \bx - \by \| \geq 2 r\) must be true for every \(\bx \in S\) and \(\by \in T\).

    1. For contradiction, assume that \(\| \bx - \by \| < 2 r\) for some \(\bx \in S\) and \(\by \in T\).

    2. Let \(\bu = \frac{1}{2r} (\by - \bx)\).

    3. Let \(\bv = \frac{1}{2r} (\bx - \by)\).

    4. Note that \(\| \bu \| < 1\) and \(\| \bv \| < 1\).

    5. Thus, \(\bu, \bv \in B\).

    6. Then, \(r (\bv - \bu) = \bx - \by\).

    7. Then, \(\| (\bx - \by) - r (\bv - \bu) \| = \| \bzero \| = 0\).

    8. This contradicts the condition that \(\| \bx + r \bu - (\by + r\bv ) \| > 0\) for every \(\bx \in S\), \(\by \in T\) and \(\bu, \bv \in B\).

  6. Thus, \(\inf \{ \| \bx - \by \| \ST \bx \in S, \by \in T \} > 0\).

Conversely, assume that \(\inf \{ \| \bx - \by \| \ST \bx \in S, \by \in T \} > 0\).

  1. Let \(\inf \{ \| \bx - \by \| \ST \bx \in S, \by \in T \} = 2 r\) where \(r > 0\).

  2. Then, \(\| \bx - \by \| \geq 2 r \Forall \bx \in S, \by \in T\).

  3. Then, \((S + rB) \cap (T + rB) = \EmptySet\).

    1. For contradiction, assume that \((S + rB) \cap (T + rB) \neq \EmptySet\).

    2. Let \(\ba \in (S + rB) \cap (T + rB)\).

    3. Then, there exists \(\bx \in S\) and \(\bu \in B\) such that \(\ba = \bx + r \bu\).

    4. And, there exists \(\by \in T\) and \(\bv \in B\) such that \(\ba = \by + r \bv\).

    5. Then, \(\bx + r \bu = \by + r \bv\).

    6. Thus, \(\bx - \by = r (\bv - \bu)\).

    7. Thus,

      \[ \| \bx - \by \| = r \| \bv - \bu \| \leq r (\| \bv \| + \| \bu \|) < r (1 + 1) = 2 r. \]
    8. This contradictions our hypothesis that \(\| \bx - \by \| \geq 2 r \Forall \bx \in S, \by \in T\).

  4. Note that \(S +rB\) and \(T + rB\) are convex and disjoint.

  5. Then, due to Corollary 7.10, there exists a hyperplane \(H\) which separates \(S + rB\) and \(T + rB\) properly.

  6. We can choose \(H\) so that \(S + rB \subseteq H_+\) and \(T + rB \subseteq H_-\).

  7. But then, \(S + \frac{r}{2} B \in H_{++}\) and \(T + \frac{r}{2} B \in H_{--}\) lie in the opposite open halfspaces.

  8. Thus, \(S\) and \(T\) are strongly separated.

7.12.6. Strong Separation Conditions#

We describe several conditions which are sufficient to achieve strong separation between two sets. See also strict separation theorem (proposition 2.4.3) of [5].

Theorem 7.165 (Strong separation: closed subtraction)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be two disjoint nonempty convex subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) strongly if \(S - T\) is closed.

Proof. We proceed as follows.

  1. Since \(S\) and \(T\) are disjoint hence \(\bzero \notin S-T\).

  2. Since \(S-T\) is closed, hence \(\bzero \notin \closure (S - T) = S - T\).

  3. By Theorem 7.164, \(S\) and \(T\) are strongly separated.

Theorem 7.166 (Strong separation: closed and compact sets)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be two disjoint nonempty convex subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) strongly if \(S\) is closed and \(T\) is compact.

Proof. We proceed as follows.

  1. Since \(S\) and \(T\) are disjoint hence \(\bzero \notin S-T\).

  2. Since \(S-T\) is closed, hence \(\bzero \notin \closure (S - T) = S - T\).

  3. By Theorem 7.164, \(S\) and \(T\) are strongly separated.

7.12.6.1. Recession Cones#

The conditions below are based on the theory of recession cones and lineality spaces of convex sets. See Recession Cones for a background.

Theorem 7.167 (Strong separation: recession cones)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be two disjoint nonempty closed and convex subsets of \(\VV\). There exists a hyperplane \(H\) that separates \(S\) and \(T\) strongly if

\[ R_S \cap R_T = L_S \cap L_T \]

where \(R_X\) denotes the recession cone and \(L_X\) denotes the lineality space of a set \(X\).

Proof. Due to Corollary 7.18, \(S - T\) is closed. Then due to Theorem 7.165, \(S\) and \(T\) are strongly separated.

7.12.6.2. Strongly Separating Hyperplane#

Recall that the (orthogonal) projection of a vector \(\bv\) on a convex set \(C\) is the vector \(\bx \in C\) which is nearest to \(\bv\) under the (Euclidean) norm. In particular, if \(\bv \in C\) then \(\bv\) is its own projection on \(C\). The projection of the vector \(\bzero\) on a convex set \(C\) is this the vector of \(C\) with the minimum norm. See Projection on Convex Sets for details.

Theorem 7.168 (Strongly separating hyperplane)

Let \(\VV\) be an \(n\)-dimensional real vector space. Let \(S\) and \(T\) be two disjoint nonempty and convex subsets of \(\VV\). Assume that \(S - T\) is closed.

Consider the vector of minimum norm (projection of the origin) in \(S - T\) given by \(\bv = \bs - \bt\) where \(\bs \in S\) and \(\bt \in T\).

  1. Let \(\ba = \frac{1}{2} \bv = \frac{1}{2} (\bs - \bt)\).

  2. Let \(\bb = \frac{1}{2} (\bs + \bt)\).

  3. Let \(c = \langle \bb, \ba \rangle\).

Then the hyperplane \(H\) given by

\[ H = \{ \bx \in \VV \ST \langle \bx, \ba \rangle = c \} \]

strongly separates \(S\) and \(T\). In other words,

\[ \langle \bx_1, \ba \rangle > c > \langle \bx_2, \ba \rangle \quad \Forall \bx_1 \in S, \bx_2 \in T. \]

Proof. .

  1. Since \(S\) and \(T\) are disjoint hence \(\bs - \bt \neq \bzero\).

  2. Hence \(\ba \neq \bzero\).

  3. \(\bs\) is the nearest point in \(\closure S\) from \(\closure T\).

  4. \(\bt\) is the nearest point in \(\closure T\) from \(\closure S\).

  5. The line segment \([\bs, \bt]\) connects these nearest points.

  6. \(\bb\) lies on this line segment.

  7. Accordingly, \(\bs\) is projection of \(\bb\) in \(\closure S\).

  8. Similarly \(\bt\) is projection of \(\bb\) in \(\closure T\).

  9. Then, due to Theorem 8.7 (orthogonal projection characterization), for every \(\bx \in S\)

    \[ \langle \bx - \bs, \bb - \bs \rangle \leq 0. \]
  10. But

    \[ \bb - \bs = \frac{1}{2} (\bs + \bt) - \bs = \frac{1}{2} (\bt - \bs) = -\ba. \]
  11. Hence \(\langle \bx - \bs, \ba \rangle \geq 0\).

  12. Hence \(\langle \bx, \ba \rangle \geq \langle \bs, \ba \rangle\).

  13. Further

    \[\begin{split} \langle \bs, \ba \rangle &= \langle \bs - \bb + \bb, \ba \rangle \\ &= \langle \ba + \bb, \ba \rangle \\ &= \| \ba \|_2^2 + \langle \bb, \ba \rangle \\ &= \| \ba \|_2^2 + c > c. \end{split}\]

    since \(\ba \neq \bzero\).

  14. Hence for every \(\bx \in S\), we have \(\langle \bx, \ba \rangle > c\).

  15. A similar argument shows that for every \(\bx \in T\), we have \(\langle \bx, \ba \rangle < c\).

7.12.7. Closed Convex Sets#

Theorem 7.169 (Strict separation theorem)

Let \(\VV\) be a real \(n\)-dimensional vector space. Let \(C \subseteq \VV\) be a nonempty closed convex set. Let \(\by \notin C\). Then, there exists \(\bp \in \VV^*\) and \(\alpha \in \RR\) such that

\[ \langle \by , \bp \rangle > \alpha \text{ and } \langle \bx, \bp \rangle \leq \alpha \Forall \bx \in C. \]

In other words, there exists a separating hyperplane such that \(C\) is contained in one of its (closed) halfspaces and \(\by\) is not in that halfspace (i.e., it is in the opposite open halfspace).

By choosing any \(\beta \in (\alpha, \langle \by , \bp \rangle)\), we also have

\[ \langle \by , \bp \rangle > \beta \text{ and } \langle \bx, \bp \rangle < \beta \Forall \bx \in C. \]

Proof. This is an application of strong separation.

  1. Define a set \(D = \{ \by \}\).

  2. Since \(C\) is closed and \(\by \notin C\), hence \(\by\) is not a closure point of \(C\).

  3. Thus, there exists \(r > 0\) such that \(B(\by, r) \cap C = \EmptySet\).

  4. Thus, \(d(C, D) \geq r\).

  5. Then, by Theorem 7.164, \(C\) and \(D\) are strongly separated.

  6. Let \(H\) be a hyperplane which strongly separates them.

  7. Then one of the closed halfspaces of \(H\) contains \(C\) but not \(\by\).

  8. Let \(H\) be described by

    \[ H = \{ \bx \in \VV \ST \langle \bx, \bp \rangle = \alpha \}. \]
  9. We can negate \(\bp, \alpha\) if necessary so that

    \[ C \subseteq H_- = \{ \bx \in \VV \ST \langle \bx, \bp \rangle \leq \alpha \}. \]
  10. Accordingly, \(\langle \by, \bp \rangle > \alpha\).

Theorem 7.170 (Closed Convex = Intersection of Halfspaces)

Let \(\VV\) be a real \(n\)-dimensional vector space. A closed convex set \(C\) of \(\VV\) is the intersection of all the closed halfspaces that contain it.

Proof. The set \(\VV\) is closed and convex but no halfspace contains it. So the statement is vacuously true.

The empty set is closed and convex and every halfspace contains it. The intersection of all halfspaces
is the empty set. Thus, the statement is vacuously true.

Let us assume that \(C\) is nonempty and not equal to \(\VV\).

  1. Let \(\ba \in \VV\) such that \(\ba \notin C\).

  2. Since \(C\) is closed, hence \(\ba\) is not a closure point of \(C\).

  3. Thus, there exists \(r > 0\) such that \(B(\ba, r) \cap C = \EmptySet\).

  4. Thus, \(\| \ba - \bx \| > 0 \Forall \bx \in C\). Specifically, \(\inf \{ \| \ba - \bx \| \ST \bx \in C \} \geq r\).

  5. Thus, by Theorem 7.164, \(\{ \ba \}\) and \(C\) are strongly separated by some hyperplane \(H\).

  6. Thus, one of the corresponding closed halfspaces contains \(C\) but not \(\ba\).

  7. In other words, for every \(\ba \notin C\), there exists a closed halfspace that doesn’t contain \(\ba\) but contains \(C\).

  8. Thus, the intersection of all halfspaces containing \(C\) cannot contain any element from \(\VV \setminus C\).

    1. If the intersection contained an element \(\ba \notin C\), then there would be a closed halfspace containing \(C\) but not \(\ba\) which would mean that the intersection of all halfspaces containing \(C\) cannot contain \(\ba\).

  9. Hence, the intersection of all closed halfspaces containing \(C\) is exactly equal to \(C\).

We now have two different characterizations of closed convex sets.

  1. From Theorem 7.122, a closed and convex set is the closure of the union of all the line segments connecting the points of the set.

  2. From Theorem 7.170, a closed and convex set is the intersection of all the closed half-spaces containing it.

7.12.8. Supporting Hyperplanes#

Definition 7.62 (Supporting hyperplane and halfspaces)

Let \(\VV\) be a real \(n\)-dimensional vector space. Let \(S \subseteq \VV\). Let \(\bx_0 \in \boundary S\) be a point on its boundary.

If there exists a nonzero vector \(\ba \in \VV\) such that \(\langle \bx, \ba \rangle \leq \langle \bx_0, \ba \rangle\) for every \(\bx \in S\), then the hyperplane \(H\) given by

\[ H = \{ \bx \ST \langle \bx, \ba \rangle = \langle \bx_0, \ba \rangle \} \]

is called a supporting hyperplane of \(S\) at \(\bx_0\).

\(H\) separates \(\{ \bx_0 \}\) and \(S\) and \(H\) contains \(\bx_0\). The halfspace \( \{ \bx \ST \langle (\bx - \bx_0), \ba \rangle \leq 0 \}\) corresponding to \(H\) is called a supporting halfspace of \(S\) at \(\bx_0\).

Convex sets have this beautiful property that there exists a supporting hyperplane at every point in the boundary of the set.

Theorem 7.171 (Supporting hyperplane theorem)

Let \(\VV\) be a real \(n\)-dimensional vector space. Let \(C\) be a nonempty convex subset of \(\VV\). Let \(\bx \in \boundary C\) be any point in the boundary of \(C\). Then, there exists a supporting hyperplane to \(C\) at \(\bx\).

Proof. Consider the case where \(\interior C = \EmptySet\).

  1. Then, due to Theorem 7.125, \(\dim \affine C < n\).

  2. Thus, there exists a hyperplane \(H\) such that \(\affine C \subseteq H\).

  3. Since \(\VV\) is finite dimensional, hence \(H\) is closed (Theorem 4.168).

  4. Thus, \(\closure C \subseteq H\).

  5. Thus, \(\bx \in H\).

  6. This \(H\) trivially separates \(\{ \bx \}\) and \(C\) as both are contained in \(H\).

Now, assume that \(\interior C \neq \EmptySet\).

  1. \(\dim \affine C = n\).

  2. \(\relint C = \interior C\).

  3. Since \(\bx \in \boundary C\), hence the sets \(\{ \bx \}\) and \(\interior C\) are disjoint.

  4. \(\relint \{ \bx \} = \{ \bx \}\) (Theorem 7.133).

  5. We have \(\relint \{ \bx \} \cap \relint C = \EmptySet\).

  6. By Theorem 7.162, there exists a hyperplane \(H\) that separates \(\{\bx \}\) and \(C\) properly.

  7. Consequently \(C\) lies entirely in one of the closed halfspaces of \(H\).

Corollary 7.11

Let \(\VV\) be a real \(n\)-dimensional vector space. A closed convex set \(C\) of \(\VV\) is the intersection of all the supporting halfspaces that contain it.