18.8. Restricted Isometry Property

This section delves deep into the implications of the restricted isometry property.

Definition 18.31 (Restricted isometry property)

A matrix \(\Phi \in \CC^{M\times N}\) is said to satisfy the RIP (restricted isometry property) of order \(K\) with a constant \(\delta \in (0, 1)\) if the following holds:

\[(1 - \delta) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta) \|\bx\|^2_2 \tag{18.43}\]

for every \(\bx \in \Sigma_K\) where

\[ \Sigma_K = \{\bx \in \CC^N : \|\bx\|_0 \leq K \} \]

is the set of all \(K\)-sparse vectors in \(\CC^N\).

Definition 18.32 (Restricted isometry constant)

If a matrix \(\Phi \in \CC^{M\times N}\) satisfies RIP of order \(K\) then the smallest value of \(\delta\) (denoted as \(\delta_K\)) for which the following holds

\[(1 - \delta) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta) \|\bx\|^2_2 \; \Forall \bx \in \Sigma_K \tag{18.44}\]

is known as the \(K\)-th restricted isometry constant for \(\Phi\). It is also called the \(K\)-th RIP constant for short. We write the bounds in terms of \(\delta_K\) as

\[(1 - \delta_K) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta_K) \|\bx\|^2_2 \; \Forall \bx \in \Sigma_K. \tag{18.45}\]

Some remarks are in order.

  • \(\Phi\) maps a vector \(\bx \in \Sigma_K \subseteq \CC^N\) into \(\CC^M\) as a vector \(\Phi \bx\) (usually \(M < N\)).

  • We will call \(\Phi \bx \in \CC^M\) an embedding of \(\bx \in \CC^N\) into \(\CC^M\).

  • RIP quantifies how much the squared length of a sparse signal changes during this embedding process.

  • We can compare matrices satisfying RIP with orthonormal bases.

  • An orthonormal basis or the corresponding unitary matrix preserves the length of a vector exactly.

  • A matrix \(\Phi\) satisfying RIP of order \(K\) is able to preserve the length of \(K\) sparse signals approximately (the approximation range given by \(\delta_K\)).

  • In this sense we can say that \(\Phi\) implements a restricted almost orthonormal system [19].

  • By restricted we mean that orthonormality is limited to \(K\)-sparse signals.

  • By almost we mean that the squared length is not preserved exactly. Rather it is preserved approximately.

  • An arbitrary matrix \(\Phi\) need not satisfy RIP of any order at all.

  • If \(\Phi\) satisfies RIP of order \(K\) then it is easy to see that \(\Phi\) satisfies RIP of any order \(L < K\) (since \(\Sigma_L \subset \Sigma_K\) whenever \(L < K\)).

  • If \(\Phi\) satisfies RIP of order \(K\), then it may or may not satisfy RIP of order \(L > K\).

  • Restricted isometry constant is a function of sparsity level \(K\) of the signal \(\bx \in \CC^N\).

Example 18.37 (Restricted isometry constant)

As a running example in this section we will use the following matrix

\[\begin{split} \Phi = \frac{1}{2} \begin{bmatrix} 1 & -1 & 1 & 1 & 1 & -1 & 1 & 1\\ -1 & -1 & 1 & 1 & -1 & 1 & 1 & -1\\ -1 & -1 & -1 & 1 & -1 & -1 & -1 & 1\\ -1 & 1 & 1 & 1 & 1 & -1 & -1 & -1 \end{bmatrix} \in \RR^{4 \times 8}. \end{split}\]

Consider

\[ \bx = \begin{pmatrix} -2 & 0 & 0 & 0 & 0 & -3 & -1 & 0 \end{pmatrix} \]

which is a \(3\)-sparse vector in \(\RR^8\). We have

\[ \by = \Phi \bx = \begin{pmatrix} 0 & -1 & 3 & 3 \end{pmatrix} \]

Now

\[ \|\bx \|_2^2 = 14, \quad \|\bx \|_2 = 3.7417 \]

and

\[ \|\by \|_2^2 = 19, \quad \| \by \|_2 = 4.3589. \]

We note that

\[ \frac{\| \by \|^2_2}{\|\bx \|^2_2} = 1.3571. \]

With this much information, all we can say is that \(\delta_3 \geq 0.3571\) for this matrix \(\Phi\), provided \(\Phi\) satisfies RIP of order \(3\), since we haven’t examined all possible \(3\)-sparse vectors.

Still what is comforting to note is that for this particular example, the distance hasn’t increased by a large factor.
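The arithmetic in this example can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; the variable names (`Phi`, `x`, `y`, `ratio`, `lower_bound`) are illustrative and not part of the text.

```python
import numpy as np

# Running-example matrix and the 3-sparse vector from the text.
Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
x = np.array([-2, 0, 0, 0, 0, -3, -1, 0], dtype=float)

y = Phi @ x                       # embedding of x
ratio = (y @ y) / (x @ x)         # ||Phi x||^2 / ||x||^2
lower_bound = abs(ratio - 1.0)    # one vector only gives delta_3 >= |ratio - 1|

print("y =", y)
print("ratio =", round(ratio, 4), " lower bound on delta_3 =", round(lower_bound, 4))
```

A single vector can only push the lower bound on \(\delta_3\) upward; it says nothing about an upper bound.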

For a given \(K\)-sparse vector \(\bx\), let \(J\) denote the support of \(\bx\); i.e.,

\[ J = \{ 1 \leq i \leq N \ST x_i \neq 0 \}. \]

In the running example

\[ J = \{ 1, 6, 7 \}. \]

We define \(\bx_J \in \CC^K\) to be the vector formed by keeping the elements in \(\bx\) indexed by \(J\) and dropping the other elements (the zero elements). Note that the order of elements is preserved. In the running example,

\[ \bx_J = \begin{pmatrix} -2 & -3 & -1 \end{pmatrix}. \]

Let \(\Phi_J\) be the corresponding sub-matrix by choosing columns from \(\Phi\) indexed by the set \(J\). Note that the order of columns is preserved. In the running example

\[\begin{split} \Phi_J = \frac{1}{2} \begin{bmatrix} 1 & -1 & 1\\ -1 & 1 & 1\\ -1 & -1 & -1\\ -1 & -1 & -1 \end{bmatrix} \in \RR^{4 \times 3}. \end{split}\]

It is easy to see that

\[ \by = \Phi \bx = \Phi_J \bx_J. \]

There are \(\binom{N}{K}\) ways of choosing a \(K\)-sparse support for \(\bx\). Thus we have to consider \(\binom{N}{K}\) corresponding submatrices \(\Phi_J\).

For each such submatrix \(\Phi_J\), the RIP bounds can be rewritten as

\[(1 - \delta_K) \|\bx\|^2_2 \leq \| \Phi_J \bx \|^2_2 \leq (1 + \delta_K) \|\bx\|^2_2 \tag{18.46}\]

for every \(\bx \in \CC^K\). Note that

\[ \| \Phi_J \bx \|^2_2 = (\Phi_J \bx)^H (\Phi_J \bx) = \bx^H \Phi_J^H \Phi_J \bx. \]
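The restriction to a support can be sketched in a few lines. This is an illustrative NumPy snippet, not the author's code; note that NumPy indices are zero-based, so the support \(\{1, 6, 7\}\) of the running example appears as `[0, 5, 6]`.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
x = np.array([-2, 0, 0, 0, 0, -3, -1, 0], dtype=float)

J = np.flatnonzero(x)            # support of x (zero-based), order preserved
x_J = x[J]                       # nonzero entries of x
Phi_J = Phi[:, J]                # columns of Phi indexed by J

# Phi x and Phi_J x_J are the same vector.
same_embedding = np.allclose(Phi @ x, Phi_J @ x_J)

# ||Phi_J x_J||^2 equals the quadratic form x_J^H (Phi_J^H Phi_J) x_J.
sq_norm = np.linalg.norm(Phi_J @ x_J) ** 2
quad_form = np.real(x_J.conj() @ (Phi_J.conj().T @ Phi_J) @ x_J)

print(J, x_J, same_embedding, sq_norm, quad_form)
```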

Theorem 18.44

An \(M \times N\) matrix \(\Phi\) cannot satisfy RIP of order \(K > M\).

Proof. This comes from the fact that for a wide matrix \(\rank \Phi \leq M\).

  1. Since every \(\phi_j \in \CC^M\) hence any set of \(M+1\) columns in \(\Phi\) is linearly dependent.

  2. Thus there exists a nonzero \(M+1\) sparse signal \(\bx \in \CC^N\) such that \(\Phi \bx = \bzero\) (it belongs to the null space of the chosen \(M+1\) columns).

  3. RIP (18.43) requires that a nonzero vector be embedded as a nonzero vector.

  4. Thus \(\Phi\) cannot satisfy RIP of order \(M+1\).

  5. The argument can be easily extended for any \(K > M\).
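The construction in this proof can be checked on the running example (\(M = 4\), \(N = 8\)). The sketch below assumes NumPy and picks the first \(M + 1\) columns arbitrarily; any other choice of \(M + 1\) columns would work the same way. It uses the SVD to find a null-space vector, which is one of several possible techniques.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
M, N = Phi.shape

cols = np.arange(M + 1)          # an arbitrary choice of M + 1 columns
sub = Phi[:, cols]               # M x (M + 1): its columns must be linearly dependent

# With full matrices, the last right-singular vector lies in the null space of `sub`.
_, _, Vh = np.linalg.svd(sub)
v = Vh[-1].conj()

x = np.zeros(N)
x[cols] = v                      # nonzero and (M + 1)-sparse
residual = np.linalg.norm(Phi @ x)

print("||x|| =", np.linalg.norm(x), " ||Phi x|| =", residual)
```

Since \(\| \Phi \bx \|_2\) is zero (up to rounding) while \(\| \bx \|_2 = 1\), the lower RIP bound fails for every \(\delta < 1\).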

Theorem 18.45

If \(\Phi\) satisfies RIP of order \(l\) then it satisfies RIP of order \(k\) where \(k < l\).

Proof. Every \(k\) sparse signal is also \(l\) sparse signal. Thus if \(\Phi\) satisfies RIP of order \(l\) then it automatically satisfies RIP of order \(k < l\).

Theorem 18.46

Let \(\Phi\) satisfy RIP of order \(k\) and \(l\) where \(k < l\). Then \(\delta_k \leq \delta_l\). In other words, restricted isometry constants are non-decreasing.

Proof. Since every \(k\) sparse signal is also \(l\) sparse signal, hence for every \(\bx \in \Sigma_k\) following must be satisfied

\[ (1 - \delta_k) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta_k) \|\bx\|^2_2 \]

and

\[ (1 - \delta_l) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta_l) \|\bx\|^2_2. \]

Since \(\delta_k\) is smallest such value for which these inequalities are satisfied hence \(\delta_l\) cannot be smaller than \(\delta_k\).

18.8.1. The First Restricted Isometry Constant

We consider the simplest case where \(K=1\). We can write \(\Phi\) in terms of its column vectors

\[ \Phi = \begin{bmatrix} \phi_1 & \dots & \phi_N \end{bmatrix}. \]
  1. Now a \(1\)-sparse vector \(\bx\) consists of only one nonzero entry.

  2. Say that \(\bx\) is nonzero at index \(j\).

  3. Then \(\Phi \bx\) is nothing but \(x_j \phi_j\).

  4. With this the restricted isometry inequality can be written as

    \[ (1 - \delta_1) |x_j|^2 \leq \| x_j \phi_j \|_2^2 \leq (1 + \delta_1) |x_j|^2. \]
  5. Dividing by \(|x_j|^2\) we get

    \[ (1 - \delta_1) \leq \| \phi_j \|_2^2 \leq (1 + \delta_1). \]

Let us formalize this in the following theorem.

Theorem 18.47 (Restricted isometry constants of order 1)

If a matrix \(\Phi\) satisfies RIP of order \(K \geq 1\) then the squared lengths of columns of \(\Phi\) satisfy the following bounds

\[1 - \delta_1 \leq \| \phi_j \|_2^2 \leq 1 + \delta_1 \; \Forall 1 \leq j \leq N. \tag{18.47}\]

When \(\delta_1 = 0\) then all columns of \(\Phi\) are unit norm. Now if columns of \(\Phi\) span \(\CC^M\) then \(\Phi\) can also be considered as a dictionary for \(\CC^M\) (see Definition 18.4).

Remark 18.8

A dictionary (Definition 18.4) satisfies RIP of order 1 with \(\delta_1 = 0\).
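Since the bounds for \(K = 1\) only involve the column norms, the smallest admissible \(\delta_1\) is \(\max_j \left | \| \phi_j \|_2^2 - 1 \right |\). A short NumPy sketch for the running-example matrix (variable names are illustrative):

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])

col_sq_norms = np.sum(np.abs(Phi) ** 2, axis=0)   # ||phi_j||_2^2 for each column
delta_1 = np.max(np.abs(col_sq_norms - 1.0))      # smallest delta satisfying the order-1 bounds

print(col_sq_norms, delta_1)
```

Every column of this matrix has four entries of magnitude \(1/2\), so all columns are unit norm and the computed value is zero.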

18.8.2. Sums and Differences of Sparse Vectors

Theorem 18.48

Let \(\bx , \by \in \CC^N\) with \(\bx \in \Sigma_k\) and \(\by \in \Sigma_l\); i.e., \( \| \bx \|_0 \leq k\) and \(\| \by \|_0 \leq l\). Then

\[ (1 - \delta_{k + l}) \| \bx \pm \by \|_2^2 \leq \| \Phi \bx \pm \Phi \by \|_2^2 \leq (1 + \delta_{k + l}) \| \bx \pm \by \|_2^2 \]

as long as \(\Phi\) satisfies RIP of order \(k + l\).

Proof. We know that

\[ \| \bx \pm \by \|_0 \leq \| \bx \|_0 + \| \by \|_0 \leq k + l. \]

Thus \(\bx \pm \by \in \Sigma_{k + l}\). The result follows.

18.8.3. Distance Between Sparse Vectors

Let \(\bx, \by \in \Sigma_K\). Then clearly \(\bx - \by \in \Sigma_{2K}\).

The \(\ell_2\) distance between vectors is given by

\[ d(\bx, \by) = \| \bx - \by \|_2 = \sqrt{(\bx - \by)^H (\bx - \by)}. \]

Now if \(\Phi\) satisfies RIP of order \(2K\) then we can see that it approximately preserves \(\ell_2\) distances between \(K\)-sparse vectors.

Theorem 18.49 (Approximate preservation of distances)

Let \(\bx, \by \in \Sigma_K \subset \CC^N\). Let \(\Phi \bx , \Phi \by \in \CC^M\) be corresponding embeddings. If \(\Phi\) satisfies RIP of order \(2K\), then

\[ (1 - \delta_{2K}) d^2(\bx, \by) \leq d^2 (\Phi \bx, \Phi \by) \leq (1 + \delta_{2K}) d^2(\bx, \by). \]

Proof. Since \(\Phi\) satisfies RIP of order \(2K\) hence for every vector \(\bv \in \Sigma_{2K}\) we have

\[ (1 - \delta_{2K}) \|\bv\|^2_2 \leq \| \Phi \bv \|^2_2 \leq (1 + \delta_{2K}) \|\bv\|^2_2. \]

But then \(\bx - \by \in \Sigma_{2K}\) for every \(\bx, \by \in \Sigma_K\) and

\[ d^2 (\bx, \by) = \| \bx - \by \|_2^2 \]

and

\[ d^2 (\Phi \bx, \Phi \by) = \| \Phi \bx - \Phi \by \|_2^2 = \| \Phi (\bx - \by) \|_2^2. \]

Thus we have the result.
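As an illustration with \(K = 1\), the sketch below takes two \(1\)-sparse vectors, computes \(\delta_2\) of the running-example matrix by brute force over all two-column submatrices, and checks the distance bounds. It assumes NumPy; the particular vectors are arbitrary choices made for this example, and the brute-force search is only practical for very small matrices.

```python
import itertools
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
N = Phi.shape[1]

# delta_2: worst spectral norm of Phi_J^H Phi_J - I over all pairs of columns
# (Phi is real here, so the plain transpose is the conjugate transpose).
delta_2 = max(
    np.linalg.norm(Phi[:, list(J)].T @ Phi[:, list(J)] - np.eye(2), 2)
    for J in itertools.combinations(range(N), 2)
)

# Two 1-sparse vectors with different supports; their difference is 2-sparse.
x = np.zeros(N)
x[0] = 3.0
y = np.zeros(N)
y[3] = -2.0

d2 = np.linalg.norm(x - y) ** 2                       # squared distance before embedding
d2_embedded = np.linalg.norm(Phi @ x - Phi @ y) ** 2  # squared distance after embedding
ratio = d2_embedded / d2

print("delta_2 =", delta_2, " ratio =", ratio)
```

The ratio must lie in \([1 - \delta_2, 1 + \delta_2]\); for this pair it sits close to the lower end of that interval.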

18.8.4. RIP with Unit Length Sparse Vectors

Sometimes it is convenient to state RIP in terms of unit length sparse vectors.

Theorem 18.50 (RIP for unit length sparse vectors)

A matrix \(\Phi\) satisfies RIP of order \(K\) if and only if the following holds

\[(1 - \delta_K) \leq \| \Phi \bx \|^2_2 \leq (1 + \delta_K) \tag{18.48}\]

for every \(\bx \in \Sigma_K\) with \(\| \bx \|_2 = 1\).

Proof. If \(\Phi\) satisfies RIP of order \(K\) then by putting \(\|\bx \|_2 =1\) in (18.43) we get (18.48).

Now the converse.

  1. We assume (18.48) holds for all unit norm vectors \(\bx \in \Sigma_K\).

  2. We need to show that (18.43) holds for all \(\bx \in \Sigma_K\).

  3. For \(\bx = \bzero\) the bounds in (18.43) are trivially satisfied.

  4. Let \(\bx \in \Sigma_K\) be some nonzero vector.

  5. Let \(\widehat{\bx} = \frac{\bx} {\| \bx \|_2}\).

  6. Clearly \(\widehat{\bx}\) is unit length. Hence

    \[\begin{split} & (1 - \delta_K) \leq \| \Phi \widehat{\bx} \|^2_2 \leq (1 + \delta_K)\\ \implies & (1 - \delta_K) \leq \left \| \Phi \frac{\bx} {\| \bx \|_2} \right \|^2_2 \leq (1 + \delta_K)\\ \implies & (1 - \delta_K) \| \bx \|_2^2 \leq \| \Phi \bx\|^2_2 \leq (1 + \delta_K) \| \bx \|_2^2. \end{split}\]
  7. Thus \(\Phi\) satisfies RIP of order \(K\).

18.8.5. Singular and Eigen Values of \(K\)-Submatrices

Consider any index set \(J \subset \{ 1, \dots, N\}\) with \(|J|=K\). Let \(\Phi_J\) be a sub matrix of \(\Phi\) consisting of columns indexed by \(J\). Assume \(K \leq M\). We define

\[ \bG \triangleq \Phi_J^H \Phi_J \in \CC^{K \times K} \]

as the Gram matrix for columns of \(\Phi_J\) (see Gram Matrices).

We consider the eigen values of \(\bG\) given by

\[ \bG \bx = \lambda \bx \]

for some \(\bx \in \CC^K\) and \(\bx \neq \bzero\). We will show that eigen values of \(\bG\) are bounded by RIP constant.

In the running example

\[\begin{split} \bG = \begin{bmatrix} 1 & 0 & 0.5\\ 0 & 1 & 0.5\\ 0.5 & 0.5 & 1 \end{bmatrix}. \end{split}\]

Eigen values of \(\bG\) are \((0.2929, 1, 1.7071)\).
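The Gram matrix and its eigen values for the running example can be computed as follows. This is a minimal NumPy sketch; `eigvalsh` is used because \(\bG\) is Hermitian, and it returns the eigen values in ascending order.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])

J = [0, 5, 6]                          # zero-based version of the support {1, 6, 7}
Phi_J = Phi[:, J]
G = Phi_J.conj().T @ Phi_J             # Gram matrix of the selected columns
eigenvalues = np.linalg.eigvalsh(G)    # real eigenvalues, ascending

print(G)
print(np.round(eigenvalues, 4))
```

The exact eigen values are \(1 - \tfrac{1}{\sqrt{2}}\), \(1\) and \(1 + \tfrac{1}{\sqrt{2}}\), which round to the values quoted in the text.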

Theorem 18.51

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any sub matrix of \(\Phi\) with \(K\) columns. Then the eigen values of \(\bG = \Phi_J^H \Phi_J\) lie in the range \([1-\delta_K, 1 + \delta_K]\).

Proof. We note that \(\bG \in \CC^{K \times K}\).

  1. Let \(\lambda\) be some eigen value of \(\bG\).

  2. Let \(\bx \in \CC^K\) be a corresponding (nonzero) eigenvector.

  3. Then

    \[\begin{split} &\bG \bx = \lambda \bx \\ \implies & \bx^H \bG \bx = \bx^H \lambda \bx\\ \implies & \bx^H \Phi_J^H \Phi_J \bx = \lambda \| \bx \|_2^2\\ \implies &\| \Phi_J \bx \|^2_2 = \lambda \| \bx \|_2^2. \end{split}\]
  4. From (18.46) we recall that the \(\delta_K\) RIP bounds apply to each vector \(\bx \in \CC^K\) for a \(K\)-column submatrix \(\Phi_J\):

    \[ (1 - \delta_K) \| \bx\|^2_2 \leq \| \Phi_J \bx \|^2_2 \leq (1 + \delta_K) \| \bx\|^2_2. \]
  5. Thus

    \[\begin{split} & (1 - \delta_K) \| \bx\|^2_2 \leq \lambda \| \bx \|_2^2 \leq (1 + \delta_K) \| \bx\|^2_2\\ \implies &(1 - \delta_K) \leq \lambda \leq (1 + \delta_K) \end{split}\]

    since \(\bx \neq \bzero\).

Corollary 18.3

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(K\) columns. Then the Gram matrix \(\bG = \Phi_J^H \Phi_J\) is full rank and invertible. Moreover \(\bG\) is positive definite.

Proof. From Theorem 18.51, the eigen values are in range \([1-\delta_K, 1 + \delta_K]\).

  1. Since \(\Phi\) satisfies RIP of order \(K\), hence \(\delta_K < 1\).

  2. Hence all eigenvalues of \(\bG\) are positive.

  3. Hence \(\bG\) is positive definite.

  4. Hence the product of the eigenvalues of \(\bG\) is positive.

  5. Thus \(\det(\bG)\) is nonzero.

  6. Hence \(\bG\) is invertible.

Theorem 18.52

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(K\) columns. Then all singular values of \(\Phi_J\) are nonzero and they are in the range given by

\[ \sqrt{1-\delta_K} \leq \sigma \leq \sqrt{1 + \delta_K} \]

where \(\sigma\) is a singular value of \(\Phi_J\).

Proof. This is a straightforward application of Lemma 4.74 and Theorem 18.51. Eigen values of \(\Phi_J^H \Phi_J\) are nothing but squares of the singular values of \(\Phi_J\).

Corollary 18.4

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Then the singular values of \(\Phi_J\) are nonzero and they are in the range given by

\[ \sqrt{1-\delta_K} \leq \sigma \leq \sqrt{1 + \delta_K} \]

where \(\sigma\) is a singular value of \(\Phi_J\).

Proof. Let \(\sigma\) be a singular value of \(\Phi_J\).

  1. Since \(\Phi\) satisfies RIP of order \(K\), it also satisfies RIP of order \(k \leq K\).

  2. From Theorem 18.52 we have

    \[ \sqrt{1-\delta_k} \leq \sigma \leq \sqrt{1 + \delta_k}. \]
  3. From Theorem 18.46 we have \(\delta_k \leq \delta_K\).

  4. Thus

    \[ 1 - \delta_K \leq 1 - \delta_k , \quad 1 + \delta_k \leq 1 + \delta_K. \]
  5. Thus

    \[ \sqrt{1-\delta_K} \leq \sigma \leq \sqrt{1 + \delta_K}. \]

Theorem 18.53

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Then the eigen values of \(\Phi_J^H \Phi_J + r \bI\) lie in the range

\[ [1-\delta_K + r, 1 + \delta_K + r]. \]

Moreover consider \(\Delta = \Phi_J^H \Phi_J - \bI\). Then

\[ \| \Delta \|_2 \leq \delta_K. \]

Proof.

  1. From Theorem 18.51 eigen values of \(\Phi_J^H \Phi_J\) lie in the range \([1-\delta_K, 1 + \delta_K]\).

  2. From Lemma 4.69 \(\lambda\) is an eigen value of \(\Phi_J^H \Phi_J\) if and only if \(\lambda + r\) is an eigen value of \(\Phi_J^H \Phi_J + r \bI\).

  3. Hence the result.

Now for \(\Delta = \Phi_J^H \Phi_J - \bI\)

  1. The eigen values lie in the range \([-\delta_K, \delta_K]\).

  2. Thus for every eigen value of \(\Delta\) we have \(|\lambda| \leq \delta_K\).

  3. Since \(\Delta\) is Hermitian, its spectral norm is the largest absolute value among its eigen values.

  4. Hence

    \[ \| \Delta \|_2 \leq \delta_K. \]

From the previous few results we see that the bound on the eigen values of \(\Phi_J^H \Phi_J\) given by \((1 - \delta_K) \leq \lambda \leq (1 + \delta_K)\) is a necessary condition for \(\Phi\) to satisfy RIP of order \(K\). We now show that this is also a sufficient condition.

Theorem 18.54

Let \(\Phi\) be an \(M \times N\) matrix with \(M \leq N\). Let \(J \subset \{ 1, \dots, N \}\) be any index set with \(|J| = K \leq M\). Let \(\Phi_J\) be the \(K\)-column sub-matrix of \(\Phi\) indexed by \(J\). Let \(\bG = \Phi_J^H \Phi_J\) be the Gram matrix of columns of \(\Phi_J\). Let the eigen values of \(\bG\) be \(\lambda\). If there exists a number \(\delta \in (0,1)\) such that

\[ 1 - \delta \leq \lambda \leq 1 + \delta \]

for every eigen value of \(\bG\) for every \(K\) column submatrix of \(\Phi\), then \(\Phi\) satisfies RIP of order \(K\).

Alternatively, let \(\Delta = \bG - \bI\). If

\[ \| \Delta \|_2 \leq \delta < 1 \]

for every \(K\) column submatrix of \(\Phi\) then \(\Phi\) satisfies RIP of order \(K\).

Alternatively, if singular values of \(\Phi_J\) satisfy

\[ \sqrt{1 - \delta} \leq \sigma \leq \sqrt{1 + \delta} \]

for every \(\Phi_J\) then \(\Phi\) satisfies RIP of order \(K\).

Proof. Equivalence of sufficient conditions

  1. We note that eigen values of \(\bG\) are related to eigen values of \(\Delta\) by the relation (see Lemma 4.69)

    \[ \lambda_G - 1 = \lambda_{\Delta} \iff \lambda_G = 1 + \lambda_{\Delta}. \]
  2. Hence

    \[ \| \Delta \|_2 \leq \delta \iff - \delta \leq \lambda_{\Delta} \leq \delta \iff 1 - \delta \leq \lambda_G \leq 1 + \delta. \]
  3. Thus the first two sufficient conditions are equivalent.

  4. Lastly the eigen values of \(\bG\) are squares of singular values of \(\Phi_J\).

  5. Thus all sufficient conditions are equivalent.

Proof of sufficient condition

  1. Now let \(\bx \in \Sigma_K\) be an arbitrary vector.

  2. Let \(J = \supp(\bx)\).

  3. Clearly \(|J| \leq K\). If \(|J| < K\) then augment \(J\) by adding some indices arbitrarily till we get \(|J| = K\).

  4. Clearly \(\bx_J\) is an arbitrary vector in \(\CC^K\) and \(\Phi \bx = \Phi_J \bx_J\).

  5. Now let \(\lambda_1\) be the largest and \(\lambda_K\) be the smallest eigen value of \(\bG = \Phi_J^H \Phi_J\).

  6. \(\bG\) is Hermitian and all its eigen values are positive, hence it is positive definite.

  7. From Lemma 4.68 we get

    \[ \lambda_K \| \bx\|_2^2 \leq \bx^H \bG \bx \leq \lambda_1 \| \bx \|_2^2 \Forall \bx \in \CC^K. \]
  8. Applying the limits on the eigen values and using \(\bx^H \bG \bx = \|\Phi_J \bx\|_2^2\), we get

    \[ (1 - \delta) \| \bx\|_2^2 \leq \|\Phi_J \bx\|_2^2 \leq (1 + \delta) \|\bx\|_2^2 \Forall \bx \in \CC^K. \]
  9. Since this holds for every index set \(J\) with \(|J|=K\) hence an equivalent statement is

    \[ (1 - \delta) \| \bx\|_2^2 \leq \|\Phi \bx\|_2^2 \leq (1 + \delta) \| \bx\|_2^2 \Forall \bx \in \Sigma_K \subset \CC^N. \]
  10. Thus \(\Phi\) indeed satisfies RIP of order \(K\) with some \(\delta_K\) not larger than \(\delta\).
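The sufficient condition suggests a direct, if expensive, way to compute \(\delta_K\) for a small matrix: enumerate all \(\binom{N}{K}\) submatrices and record the worst deviation of the Gram eigen values from \(1\). The sketch below assumes NumPy; the function name `rip_constant` is a hypothetical helper introduced here, and the approach is only feasible for tiny \(N\) and \(K\).

```python
import itertools
import numpy as np

def rip_constant(Phi, K):
    """Brute-force delta_K: worst deviation from 1 of the eigenvalues of
    Phi_J^H Phi_J over all K-column submatrices Phi_J."""
    N = Phi.shape[1]
    worst = 0.0
    for J in itertools.combinations(range(N), K):
        Phi_J = Phi[:, list(J)]
        eig = np.linalg.eigvalsh(Phi_J.conj().T @ Phi_J)   # ascending order
        worst = max(worst, 1.0 - eig[0], eig[-1] - 1.0)
    return worst

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])

deltas = {K: rip_constant(Phi, K) for K in range(1, 5)}
for K, d in deltas.items():
    print(K, round(d, 4))
```

For the running-example matrix the computed value stays below \(1\) for \(K \leq 3\), while for \(K = 4\) it reaches \(1\) (up to rounding): some set of four columns is linearly dependent, so this \(\Phi\) does not satisfy RIP of order \(4\) for any \(\delta \in (0, 1)\).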

Theorem 18.55

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Let \(\Phi_J^{\dag}\) be its Moore-Penrose pseudo-inverse. Then the singular values of \(\Phi_J^{\dag}\) are nonzero and they are in the range given by

\[ \frac{1}{\sqrt{1+\delta_K}} \leq \sigma \leq \frac{1}{\sqrt{1 - \delta_K}} \]

where \(\sigma\) is a singular value of \(\Phi_J^{\dag}\).

Proof. Construction of pseudoinverse of a matrix through its singular value decomposition is discussed in Lemma 4.79.

  1. Lemma 4.80 shows that if \(\sigma\) is a nonzero singular value of \(\Phi_J^{\dag}\) then \(\frac{1}{\sigma}\) is a nonzero singular value of \(\Phi_J\).

  2. From Corollary 18.4 we have that if \(\frac{1}{\sigma}\) is a singular value of \(\Phi_J\) then,

    \[ \sqrt{1-\delta_K} \leq \frac{1}{\sigma} \leq \sqrt{1 + \delta_K} \]
  3. Inverting the terms in the inequalities we get our result.

Theorem 18.56

Eigen values of \(\bG = \Phi_J^H \Phi_J\) provide a lower bound on \(\delta_K\) given by

\[ \delta_K \geq \max (1 - \lambda_{\min}, \lambda_{\max} - 1) \]

where \(J\) is some index set choosing \(K\) columns of \(\Phi\) and \(\delta_K\) is the \(K\)-th restricted isometry constant for \(\Phi\).

In other words, singular values of \(\Phi_J\) provide a lower bound on \(\delta_K\) given by

\[ \delta_K \geq \max (1 - \sigma_{\min}^2, \sigma_{\max}^2 - 1) \]

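Proof. By Theorem 18.51, every eigen value of \(\bG\) lies in \([1 - \delta_K, 1 + \delta_K]\). In particular \(\lambda_{\min} \geq 1 - \delta_K\) and \(\lambda_{\max} \leq 1 + \delta_K\), which gives \(\delta_K \geq 1 - \lambda_{\min}\) and \(\delta_K \geq \lambda_{\max} - 1\). The second form follows since the eigen values of \(\bG\) are the squares of the singular values of \(\Phi_J\).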

In the running example, the bounds tell us that

\[ \delta_3 \geq 0.7071. \]

Certainly we have to consider all possible \(\binom{N}{K}\) sub-matrices \(\Phi_J\) to come up with an overall lower bound on \(\delta_K\).

This result doesn’t provide us any upper bound on \(\delta_K\).
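The singular value form of the bound can be evaluated for the submatrix of the running example. A minimal NumPy sketch (names are illustrative); `svd` with `compute_uv=False` returns the singular values in descending order.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])

Phi_J = Phi[:, [0, 5, 6]]                         # columns {1, 6, 7} in one-based indexing
sigma = np.linalg.svd(Phi_J, compute_uv=False)    # descending: sigma[0] = max, sigma[-1] = min

# Lower bound on delta_3 contributed by this single submatrix.
bound = max(1.0 - sigma[-1] ** 2, sigma[0] ** 2 - 1.0)

print(np.round(sigma ** 2, 4), round(bound, 4))
```

This one submatrix already improves on the bound \(0.3571\) obtained earlier from a single vector supported on the same columns.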

Theorem 18.57

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Then

\[ \| \Phi_J \bx \|_2 \leq \sqrt{1 + \delta_K} \| \bx \|_2 \Forall \bx \in \CC^k. \]

Moreover

\[ \| \Phi_J^H \by \|_2 \leq \sqrt{1 + \delta_K} \| \by \|_2 \Forall \by \in \CC^M. \]

Proof. We note that \(\Phi_J\) is an \(M \times k\) matrix.

  1. Let \(\sigma_1\) be the largest singular value of \(\Phi_J\).

  2. Then by Lemma 4.77 we have

    \[ \| \Phi_J \bx \|_2 \leq \sigma_1 \| \bx \|_2 \Forall \bx \in \CC^k. \]

    and

    \[ \| \Phi_J^H \by \|_2 \leq \sigma_1 \| \by \|_2 \Forall \by \in \CC^M. \]
  3. From Theorem 18.52 and Corollary 18.4 we get

    \[ \sigma_1 \leq \sqrt{1 + \delta_K}. \]
  4. This completes the proof.

The first inequality is a restatement of the restricted isometry property in (18.46). The second inequality is interesting. In compressive sensing terms, \(\by\) is a measurement vector and we are using \(\Phi_J^H\) to project \(\by\) back into \(\CC^N\) over a \(k\)-sparse support identified by \(J\). The inequality provides an upper bound on how much the length can increase during this operation.

Theorem 18.58

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Let \(\Phi_J^{\dag}\) be its Moore-Penrose pseudo-inverse. Then

\[ \| \Phi_J^{\dag} \by \|_2 \leq \frac{1}{\sqrt{1 - \delta_K}} \| \by \|_2 \Forall \by \in \CC^M. \]

Proof. We note that \(\Phi_J^{\dag}\) is a \(k \times M\) matrix.

  1. Let \(\sigma_1\) be the largest singular value of \(\Phi_J^{\dag}\).

  2. Then by Lemma 4.77 we have

    \[ \| \Phi_J^{\dag} \by \|_2 \leq \sigma_1 \| \by \|_2 \Forall \by \in \CC^M. \]
  3. From Theorem 18.55 we see that singular values of \(\Phi_J^{\dag}\) satisfy the inequalities

    \[ \frac{1}{\sqrt{1+\delta_K}} \leq \sigma \leq \frac{1}{\sqrt{1 - \delta_K}}. \]
  4. Thus

    \[ \sigma_1 \leq \frac{1}{\sqrt{1 - \delta_K}}. \]
  5. Plugging it in we get

    \[ \| \Phi_J^{\dag} \by \|_2 \leq \frac{1}{\sqrt{1 - \delta_K}} \| \by \|_2 \Forall \by \in \CC^M. \]

In the previous theorem we saw that back-projection using \(\Phi_J^H\) had an upper bound on how much the length of measurement vector could increase. In this theorem we see another upper bound on how much the length of measurement vector can increase when back projected using the pseudo inverse of \(\Phi_J\).
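Both back-projection bounds can be observed on the running example. The sketch below assumes NumPy and works with the singular values of one particular submatrix rather than with \(\delta_K\) itself; by Corollary 18.4 those singular values lie in \([\sqrt{1 - \delta_K}, \sqrt{1 + \delta_K}]\), so the bounds checked here imply the ones stated in the theorems.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
Phi_J = Phi[:, [0, 5, 6]]
y = np.array([0.0, -1.0, 3.0, 3.0])               # measurement vector of the running example

sigma = np.linalg.svd(Phi_J, compute_uv=False)    # descending
pinv = np.linalg.pinv(Phi_J)                      # Moore-Penrose pseudo-inverse
sigma_pinv = np.linalg.svd(pinv, compute_uv=False)

back_proj = np.linalg.norm(Phi_J.conj().T @ y)    # ||Phi_J^H y||
pinv_proj = np.linalg.norm(pinv @ y)              # ||Phi_J^dagger y||
norm_y = np.linalg.norm(y)

print("singular values of pinv:", np.round(sigma_pinv, 4))
print(back_proj, "<=", sigma[0] * norm_y)
print(pinv_proj, "<=", norm_y / sigma[-1])
```

Because \(\by = \Phi_J \bx_J\) and \(\Phi_J\) has full column rank, the pseudo-inverse recovers \(\bx_J\) exactly in this example.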

Theorem 18.59

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any submatrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Then

\[ (1 - \delta_K) \| \bx \|_2 \leq \| \Phi_J^H \Phi_J \bx \|_2 \leq (1 + \delta_K) \| \bx \|_2 \Forall \bx \in \CC^k. \]

Moreover

\[ \frac{1}{1 + \delta_K} \| \bx \|_2 \leq \| \left (\Phi_J^H \Phi_J \right)^{-1} \bx \|_2 \leq \frac{1}{1 - \delta_K} \| \bx \|_2 \Forall \bx \in \CC^k. \]

Proof. We note that \(\Phi_J\) is a full column rank tall matrix.

  1. We recall that all singular values of \(\Phi_J\) are positive and are bounded by (Corollary 18.4):

    \[ \sqrt{1-\delta_K} \leq \sigma_k \leq \dots \leq \sigma_1 \leq \sqrt{1 + \delta_K} \]

    where \(\sigma_1, \dots, \sigma_k\) are the singular values of \(\Phi_J\) (in descending order).

  2. We note that \(\Phi_J^H \Phi_J\) is a \(k \times k\) matrix which is invertible (Corollary 18.3).

  3. From Lemma 4.83 we get

    \[ \sigma_k^2 \| \bx \|_2 \leq \| \Phi_J^H \Phi_J \bx \|_2 \leq \sigma_1^2 \| \bx \|_2 \Forall \bx \in \CC^k. \]
  4. Applying the bounds on \(\sigma_i\) we get the result

    \[ (1 - \delta_K) \| \bx \|_2 \leq \| \Phi_J^H \Phi_J \bx \|_2 \leq (1 + \delta_K) \| \bx \|_2 \Forall \bx \in \CC^k. \]
  5. From Lemma 4.85 we have the bounds for \(\left ( \Phi_J^H \Phi_J \right) ^{-1}\) given by

    \[ \frac{1}{\sigma_1^2} \| \bx \|_2 \leq \| \left(\Phi_J^H \Phi_J \right)^{-1} \bx \|_2 \leq \frac{1}{\sigma_k^2} \| \bx \|_2 \Forall \bx \in \CC^k. \]
  6. Applying the bounds on \(\sigma_i\) we get the result

    \[ \frac{1}{1 + \delta_K} \| \bx \|_2 \leq \| \left (\Phi_J^H \Phi_J \right)^{-1} \bx \|_2 \leq \frac{1}{1 - \delta_K} \| \bx \|_2 \Forall \bx \in \CC^k. \]

In the sequel we will discuss that \(\Phi^H \Phi \bx\) can work as a very good proxy for the signal \(\bx\). The results in this theorem are very comforting in this regard.

Theorem 18.60

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(\Phi_J\) be any sub matrix of \(\Phi\) with \(k\) columns where \(k \leq K\). Then

\[ \| (\Phi_J^H \Phi_J - \bI ) \bx \|_2 \leq \delta_K \| \bx \|_2 \Forall \bx \in \CC^k. \]

Proof.

  1. From Theorem 18.53 we get

    \[ \| \Phi_J^H \Phi_J - \bI \|_2 \leq \delta_k \leq \delta_K. \]
  2. Thus since spectral norm is subordinate

    \[ \| (\Phi_J^H \Phi_J - \bI ) \bx \|_2 \leq \|\Phi_J^H \Phi_J - \bI \|_2 \| \bx \|_2 \leq \delta_K \| \bx \|_2 \Forall \bx \in \CC^k. \]
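The bounds of the last two theorems can be checked numerically for the submatrix of the running example. This NumPy sketch uses the eigen values of this particular Gram matrix and the spectral norm of \(\Phi_J^H \Phi_J - \bI\) in place of \(\delta_K\); both are controlled by \(\delta_K\) through Theorem 18.51 and Theorem 18.53. Variable names are illustrative.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
J = [0, 5, 6]
x_J = np.array([-2.0, -3.0, -1.0])

G = Phi[:, J].conj().T @ Phi[:, J]
I = np.eye(len(J))
lam = np.linalg.eigvalsh(G)          # ascending: lam[0] = sigma_k^2, lam[-1] = sigma_1^2
norm_x = np.linalg.norm(x_J)

gram_norm = np.linalg.norm(G @ x_J)                   # ||G x||
inv_norm = np.linalg.norm(np.linalg.solve(G, x_J))    # ||G^{-1} x|| without forming the inverse
delta_J = np.linalg.norm(G - I, 2)                    # spectral norm of G - I for this J
dev_norm = np.linalg.norm((G - I) @ x_J)              # ||(G - I) x||

print(lam[0] * norm_x, "<=", gram_norm, "<=", lam[-1] * norm_x)
print(norm_x / lam[-1], "<=", inv_norm, "<=", norm_x / lam[0])
print(dev_norm, "<=", delta_J * norm_x)
```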

18.8.6. Approximate Orthogonality

We are going to show that disjoint sets of columns from \(\Phi\) span nearly orthogonal subspaces. This property is proved in [61].

Theorem 18.61

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(S\) and \(T\) denote index sets over the columns of \(\Phi\) with \(|S| + |T| \leq K\) and \(S \cap T = \EmptySet\). In other words, \(S\) and \(T\) are disjoint index sets. Let \(\Phi_S\) and \(\Phi_T\) denote corresponding sub-matrices consisting of columns indexed by \(S\) and \(T\) respectively. Then

\[ \| \Phi_S^H \Phi_T \|_2 \leq \delta_K \]

where \(\| \cdot \|_2\) denotes the \(2\)-norm or spectral norm of a matrix.

Proof. Define \(R = S \cup T\).

  1. Consider the sub-matrix \(\Phi_R\).

  2. Construct another matrix \(\Psi = \Phi_R^H \Phi_R - \bI\).

  3. The off-diagonal entries of \(\Psi\) are nothing but inner products of columns of \(\Phi_R\).

  4. We note that every entry in the matrix \(\Phi_S^H \Phi_T\) is an entry in \(\Psi\).

  5. Moreover, \(\Phi_S^H \Phi_T\) is a submatrix of \(\Psi\).

  6. The spectral norm of a sub-matrix is never greater than the spectral norm of the matrix containing it.

  7. Thus

    \[ \| \Phi_S^H \Phi_T \|_2 \leq \| \Phi_R^H \Phi_R - \bI \|_2. \]
  8. From Theorem 18.53 the eigen values of \(\Phi_R^H \Phi_R - \bI\) satisfy

    \[ 1-\delta_K - 1 \leq \lambda \leq 1 + \delta_K -1. \]
  9. Thus the spectral norm of \(\Phi_R^H \Phi_R - \bI\), which for a Hermitian matrix is the largest absolute value among its eigen values (see Theorem 4.145), satisfies

    \[ \| \Phi_R^H \Phi_R - \bI \|_2 \leq \delta_K. \]
  10. Plugging back we get

    \[ \| \Phi_S^H \Phi_T \|_2 \leq \delta_K. \]
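The key step of the proof, that \(\Phi_S^H \Phi_T\) is a submatrix of \(\Psi = \Phi_R^H \Phi_R - \bI\) and hence has no larger spectral norm, can be illustrated on the running example. The index sets below are an arbitrary choice for this sketch (zero-based), and NumPy is assumed.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])

S, T = [0], [5, 6]                   # disjoint index sets
R = S + T                            # their union, with S listed first

cross = Phi[:, S].conj().T @ Phi[:, T]                  # Phi_S^H Phi_T, shape |S| x |T|
Psi = Phi[:, R].conj().T @ Phi[:, R] - np.eye(len(R))   # Phi_R^H Phi_R - I

# With this ordering of R, the cross term is the upper-right block of Psi.
is_block = np.allclose(Psi[:len(S), len(S):], cross)

lhs = np.linalg.norm(cross, 2)       # spectral norm of the cross term
rhs = np.linalg.norm(Psi, 2)         # spectral norm of Psi, itself at most delta_K

print(is_block, lhs, "<=", rhs)
```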

This result has a useful corollary. It establishes the approximate orthogonality between a set of columns in \(\Phi\) and the portion of a sparse vector not covered by those columns.

Corollary 18.5

Let \(\Phi\) satisfy the RIP of order \(K\) where \(K \leq M\). Let \(T \subset \{1, \dots, N \}\) be an index set and let \(\bx \in \CC^N\) be some vector. Let \(S = \supp(\bx)\). Further let us assume that \(K \geq | T \cup S |\). Define \(R = S \setminus T\).

Then the following holds

\[ \| \Phi_T^H \Phi \bx_R \|_2 \leq \delta_K \| \bx_R \|_2 \]

where \(\bx_R\) is obtained by keeping entries in \(\bx\) indexed by \(R\) while setting others to 0 (see Definition 18.11).

Proof. The set \(R\) denotes the indices at which \(\bx\) is nonzero but not yet covered in \(T\). In a typical sparse recovery algorithm, one has discovered a candidate support \(T\) which may not include all of \(S\).

  1. Since

    \[ \Phi \bx = \sum_{i=1}^N \phi_i x_i \]

    and \(\bx_R\) is zero on entries not indexed by \(R\), hence

    \[ \Phi \bx_R = \Phi_R \bx_R \]

    where on the R.H.S. \(\bx_R \in \CC^{|R|}\) is obtained by dropping the entries of \(\bx_R\) not indexed by \(R\) (see Definition 18.11).

  2. Thus we have

    \[ \| \Phi_T^H \Phi \bx_R \|_2 = \| \Phi_T^H \Phi_R \bx_R \|_2. \]
  3. From Lemma 4.90 we know that any operator norm is subordinate.

  4. Thus

    \[ \| \Phi_T^H \Phi_R \bx_R \|_2 \leq \| \Phi_T^H \Phi_R \|_2 \| \bx_R \|_2. \]
  5. Since \(K \geq | T \cup S |\) hence we have

    \[ | R | = | S \setminus T | \leq K. \]
  6. Further \(T\) and \(R\) are disjoint and \(| T \cup R | \leq K\).

  7. Applying Theorem 18.61 we get

    \[ \| \Phi_T^H \Phi_R \|_2 \leq \delta_K. \]
  8. Putting back, we get our desired result

    \[ \| \Phi_T^H \Phi \bx_R \|_2 \leq \delta_K \| \bx_R \|_2. \]

18.8.7. Signal Proxy

We can use the results so far to formalize the idea of signal proxy.

Theorem 18.62

Let \(\bx\) be a \(k\)-sparse signal. Let \(\Phi\) satisfy the RIP of order \(k + l\) or higher. Let \(\bp\) be defined as

\[ \bp = (\Phi^H \Phi \bx)|_l \]

i.e., \(\bp\) is obtained by keeping the \(l\) largest entries (in magnitude) in \(\bb = \Phi^H \Phi \bx\). Then the following holds

\[ \| \bp \|_2 \leq (1 + \delta_l + \delta_{k + l}) \| \bx \|_2. \]

Proof. Let \(A = \supp(\bx)\) and \(B = \supp(\bp)\).

  1. Then \(|A| \leq k\) and \(|B| \leq l\).

  2. Clearly

    \[ \bp = (\Phi^H \Phi \bx)|_l = (\Phi^H \Phi \bx)_B. \]
  3. From Theorem 18.20 we get

    \[ \bp = \Phi_B^H \Phi \bx. \]
  4. Let \(C = A \setminus B\).

  5. Since \(\bx\) is supported on \(A\) only, hence we can write

    \[ \bx = \bx_B + \bx_C. \]
  6. Thus from Corollary 18.1 we get (\(B\) and \(C\) are disjoint)

    \[ \Phi \bx = \Phi_B \bx_B + \Phi_C \bx_C. \]
  7. Thus we have

    \[ \bp = \Phi_B^H \Phi_B \bx_B + \Phi_B^H \Phi_C \bx_C. \]
  8. Using triangle inequality we can write

    \[ \| \bp \|_2 \leq \|\Phi_B^H \Phi_B \bx_B \|_2 + \| \Phi_B^H \Phi_C \bx_C\|_2. \]
  9. Theorem 18.59 gives us

    \[ \|\Phi_B^H \Phi_B \bx_B \|_2 \leq (1 + \delta_l) \|\bx_B \|_2. \]
  10. Since \(B\) and \(C\) are disjoint, hence Theorem 18.61 gives us

    \[ \| \Phi_B^H \Phi_C \bx_C\|_2 \leq \delta_{k +l} \| \bx_C \|_2. \]
  11. Since \(B\) and \(C\) are disjoint, hence

    \[ \| \bx_B \|_2 \leq \| \bx \|_2 \text{ and } \| \bx_C \|_2 \leq \| \bx \|_2. \]
  12. Finally

    \[ \| \bp \|_2 \leq (1 + \delta_l) \|\bx_B \|_2 + \delta_{k +l} \| \bx_C \|_2 \leq (1 + \delta_l + \delta_{k +l}) \| \bx \|_2. \]
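The decomposition used in the proof can be traced on the running-example matrix with \(k = 2\) and \(l = 1\), so that only RIP of order \(3\) is needed. The vector below is an arbitrary choice for this sketch, indices are zero-based, and NumPy is assumed. The check stops at the triangle inequality step; bounding the two terms by \((1 + \delta_l)\) and \(\delta_{k + l}\) is what the remaining steps of the proof do.

```python
import numpy as np

Phi = 0.5 * np.array([
    [ 1, -1,  1,  1,  1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1,  1, -1],
    [-1, -1, -1,  1, -1, -1, -1,  1],
    [-1,  1,  1,  1,  1, -1, -1, -1],
])
N = Phi.shape[1]
ell = 1                                          # number of proxy entries kept (l in the text)

x = np.zeros(N)
x[[5, 6]] = [-3.0, -1.0]                         # 2-sparse signal, support A = {5, 6}

b = Phi.conj().T @ (Phi @ x)                     # b = Phi^H Phi x
B = np.sort(np.argsort(np.abs(b))[-ell:])        # indices of the ell largest-magnitude entries
p = np.zeros(N)
p[B] = b[B]                                      # proxy: keep those entries, zero elsewhere

A = np.flatnonzero(x)
C = np.setdiff1d(A, B)                           # part of the support not covered by B
term_B = Phi[:, B].conj().T @ (Phi[:, B] @ x[B])     # Phi_B^H Phi_B x_B
term_C = Phi[:, B].conj().T @ (Phi[:, C] @ x[C])     # Phi_B^H Phi_C x_C

print("B =", B, " C =", C)
print(np.linalg.norm(p), "<=", np.linalg.norm(term_B) + np.linalg.norm(term_C))
```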

18.8.8. RIP and Inner Product

Let \(\bx\) and \(\bx'\) be two different vectors in \(\CC^N\) such that their supports are disjoint; i.e., if

\[ T = \supp(\bx) \subseteq \{ 1 , \dots, N \} \]

and

\[ T' = \supp(\bx') \subseteq \{ 1, \dots, N \} \]

then \(T \cap T' = \EmptySet\).

Clearly

\[ \| \bx \|_0 = |T | \]

and

\[ \| \bx' \|_0 = | T' |. \]

Since the support of \(\bx\) and \(\bx'\) are disjoint hence it is straightforward that

\[ \langle \bx, \bx' \rangle = 0. \]

What can we say about the inner product of their corresponding embedded vectors \(\Phi \bx \) and \(\Phi \bx'\)?

The following theorem provides an upper bound on the magnitude of the inner product when the signal vectors \(\bx , \bx'\) belong to the Euclidean space \(\RR^N\). This result is adapted from [21].

Theorem 18.63

Assume that the sensing matrix \(\Phi \in \RR^{M \times N}\) satisfies RIP of order \(k + k'\). For all \(\bx, \bx' \in \RR^N\) supported on disjoint subsets \(T, T' \subseteq \{1,\dots, N \}\) with \( |T| \leq k\) and \(|T'| \leq k'\), we have

\[ | \langle \Phi \bx, \Phi \bx' \rangle | \leq \delta_{k + k'} \| \bx \|_2 \| \bx' \|_2 \]

where \(\delta_{k + k'}\) is the restricted isometry constant for the sparsity level \(k + k'\).

Proof. Let \(\widehat{\bx} = \frac{ \bx}{\| \bx \|_2}\) and \(\widehat{\bx'} = \frac{\bx'}{\| \bx' \|_2}\) be the corresponding unit norm vectors.

  1. Then

    \[ \langle \Phi \bx, \Phi \bx' \rangle = \langle \Phi \widehat{\bx}, \Phi \widehat{\bx'} \rangle \| \bx \|_2 \| \bx' \|_2. \]
  2. Hence if we prove the bound for unit norm vectors, then it will be straightforward to prove the bound for arbitrary vectors.

  3. Let us assume without loss of generality that \(\bx, \bx'\) are unit norm.

  4. We need to show that

    \[ | \langle \Phi \bx, \Phi \bx' \rangle | \leq \delta_{k + k'}. \]
  5. With the help of the parallelogram identity (Theorem 4.93), we have

    \[ \langle \Phi \bx, \Phi \bx' \rangle = \frac{1}{4} \left ( \|\Phi \bx + \Phi \bx' \|_2^2 - \| \Phi \bx - \Phi \bx' \|_2^2 \right ). \]
  6. Thus

    \[ |\langle \Phi \bx, \Phi \bx' \rangle | = \frac{1}{4} \left | \|\Phi \bx + \Phi \bx' \|_2^2 - \| \Phi \bx - \Phi \bx' \|_2^2 \right |. \]
  7. Now

    \[ \| \bx \pm \bx' \|_2^2 = \| \bx\|_2^2 + \| \bx' \|_2^2 \pm 2 \langle \bx, \bx' \rangle = \| \bx\|_2^2 + \| \bx' \|_2^2 = 2 \]

    since \(\bx , \bx'\) are orthogonal and unit norm.

  8. Using Theorem 18.48 we have

    \[\begin{split} &(1 - \delta_{k + k'}) \| \bx \pm \bx' \|_2^2 \leq \| \Phi \bx \pm \Phi \bx' \|_2^2 \leq (1 + \delta_{k + k'}) \| \bx \pm \bx' \|_2^2\\ \implies & 2 (1 - \delta_{k + k'}) \leq \| \Phi \bx \pm \Phi \bx' \|_2^2 \leq 2 (1 + \delta_{k + k'}). \end{split}\]
  9. Hence the maximum value of \(\| \Phi \bx \pm \Phi \bx' \|_2^2\) can be \(2 (1 + \delta_{k + k'})\) while the minimum value of \(\| \Phi \bx \pm \Phi \bx' \|_2^2\) can be \(2 (1 - \delta_{k + k'})\).

  10. This gives us the upper bound

    \[ |\langle \Phi \bx, \Phi \bx' \rangle | \leq \frac{1}{4} \left ( 2 (1 + \delta_{k + k'}) - 2 (1 - \delta_{k + k'})\right ) = \delta_{k + k'}. \]
  11. Finally when \(\bx, \bx'\) are not unit norm, the bound generalizes to

    \[ |\langle \Phi \bx, \Phi \bx' \rangle | \leq \delta_{k + k'} \| \bx \|_2 \| \bx' \|_2. \]

A variation of this result is presented below:

Theorem 18.64

Assume that the sensing matrix \(\Phi \in \RR^{M \times N}\). Let \(\bu, \bv \in \RR^N\) be given and let

\[ K = \max( \| \bu + \bv \|_0 , \| \bu - \bv \|_0). \]

Let \(\Phi\) satisfy RIP of order \(K\) with the constant \(\delta_K\). Then

\[ | \langle \Phi \bu, \Phi \bv \rangle - \langle \bu, \bv \rangle | \leq \delta_{K} \| \bu \|_2 \| \bv \|_2. \]

This result is more general as it doesn’t require \(\bu, \bv\) to be supported on disjoint index sets. All it requires is them to be sufficiently sparse.

Proof. As in the previous result, it is sufficient to prove it for the case where \(\| \bu \|_2 = \| \bv \|_2 = 1\). The simplified inequality becomes

\[ | \langle \Phi \bu, \Phi \bv \rangle - \langle \bu, \bv \rangle | \leq \delta_{K}. \]
  1. Clearly

    \[ \| \bu \pm \bv \|_2^2 = \| \bu \|_2^2 + \| \bv \|_2^2 \pm 2 \langle \bu , \bv \rangle = 2 \pm 2 \langle \bu , \bv \rangle. \]
  2. Due to RIP, we have

    \[ (1 - \delta_K) (2 \pm 2 \langle \bu , \bv \rangle) \leq \| \Phi (\bu \pm \bv ) \|_2^2 \leq (1 + \delta_K)(2 \pm 2 \langle \bu , \bv \rangle). \]
  3. From the parallelogram identity, we have

    (18.49)#\[ \langle \Phi \bu , \Phi \bv \rangle = \frac{1}{4} \left ( \| \Phi ( \bu + \bv) \|_2^2 - \| \Phi ( \bu - \bv) \|_2^2 \right ).\]
  4. Taking the upper bound on \(\| \Phi ( \bu + \bv) \|_2^2\) and the lower bound on \(\| \Phi ( \bu - \bv) \|_2^2\) in (18.49), we obtain

    \[ \langle \Phi \bu , \Phi \bv \rangle \leq \frac{1}{2} \left ((1 + \delta_K)(1 + \langle \bu , \bv \rangle) - (1 - \delta_K)(1 - \langle \bu , \bv \rangle) \right ). \]
  5. Simplifying, we get

    \[ \langle \Phi \bu , \Phi \bv \rangle \leq \langle \bu , \bv \rangle + \delta_K. \]
  6. At the same time, taking the lower bound on \(\| \Phi ( \bu + \bv) \|_2^2\) and the upper bound on \(\| \Phi ( \bu - \bv) \|_2^2\) in (18.49), we obtain

    \[ \langle \Phi \bu , \Phi \bv \rangle \geq \frac{1}{2} \left ((1 - \delta_K)(1 + \langle \bu , \bv \rangle) - (1 + \delta_K)(1 - \langle \bu , \bv \rangle) \right ). \]
  7. Simplifying, we get

    \[ \langle \Phi \bu , \Phi \bv \rangle \geq \langle \bu , \bv \rangle - \delta_K. \]
  8. Combining the two results, we obtain

    \[ | \langle \Phi \bu , \Phi \bv \rangle - \langle \bu , \bv \rangle | \leq \delta_K. \]
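The bound of Theorem 18.64 can be checked numerically on a small example. The following is only a sketch under stated assumptions: the matrix sizes, the random Gaussian matrix, the seed, and the helper `rip_constant` are hypothetical choices made for illustration, and the helper computes \(\delta_K\) by exhaustive search over supports, which is feasible only for tiny problems.

```python
import itertools
import numpy as np

def rip_constant(Phi, K):
    """Brute-force delta_K: the largest deviation from 1 of the eigenvalues of
    Phi_S^H Phi_S over all supports S with |S| = K (exponential cost)."""
    delta = 0.0
    for S in itertools.combinations(range(Phi.shape[1]), K):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.conj().T @ cols)  # ascending order
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

rng = np.random.default_rng(0)
M, N, K = 6, 8, 4                      # small, arbitrary sizes (assumption)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
delta_K = rip_constant(Phi, K)

worst = 0.0
for _ in range(1000):
    S = rng.choice(N, size=K, replace=False)
    u = np.zeros(N)
    v = np.zeros(N)
    u[S[:3]] = rng.standard_normal(3)  # supp(u) and supp(v) overlap on S[1], S[2]
    v[S[1:]] = rng.standard_normal(3)  # u + v and u - v are supported on S
    gap = abs((Phi @ u) @ (Phi @ v) - u @ v)
    worst = max(worst, gap / (np.linalg.norm(u) * np.linalg.norm(v)))

print(f"delta_K = {delta_K:.4f}, worst observed ratio = {worst:.4f}")
```

The observed ratio never exceeds the computed constant. For such a small random matrix the computed value may well exceed 1, in which case \(\Phi\) does not satisfy RIP in the sense of Definition 18.31; the inequality still holds because the proof uses only the two-sided norm bound. Choosing \(\bu\) and \(\bv\) with disjoint supports gives \(\langle \bu, \bv \rangle = 0\) and recovers the setting of Theorem 18.63.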

For the complex case, the result can be generalized if we choose a bilinear inner product rather than the usual sesquilinear inner product.

Theorem 18.65

Let \(\bu, \bv \in \CC^N\) be given and let

\[ K = \max( \| \bu + \bv \|_0 , \| \bu - \bv \|_0). \]

Let the complex space \(\CC^N\) be equipped with the bilinear inner product

\[ \langle \bu, \bv \rangle_B \triangleq \Re (\langle \bu, \bv \rangle) \]

i.e. the real part of the standard inner product.

Let \(\Phi\) satisfy RIP of order \(K\) with the constant \(\delta_K\). Then

\[ | \langle \Phi \bu, \Phi \bv \rangle_B - \langle \bu, \bv \rangle_B | \leq \delta_{K} \| \bu \|_2 \| \bv \|_2. \]

Proof. Recall that the norm induced by the bilinear inner product \(\langle \bu, \bv \rangle_B\) is the usual \(\ell_2\) norm since

\[ \langle \bu, \bu \rangle_B = \Re (\langle \bu, \bu \rangle) = \Re (\| \bu \|_2^2) =\| \bu \|_2^2. \]
  1. Let us just work out the parallelogram identity for the complex case

    \[\begin{split} \| \bx \pm \by \|_2^2 &= \langle \bx \pm \by , \bx \pm \by \rangle_B\\ &= \langle \bx, \bx \rangle_B + \langle \by, \by \rangle_B \pm \langle \bx, \by \rangle_B \pm \langle \by, \bx \rangle_B\\ &= \langle \bx, \bx \rangle_B + \langle \by, \by \rangle_B \pm 2 \langle \bx, \by \rangle_B \end{split}\]

    due to the bilinearity of the real inner product.

  2. We can see that the rest of the proof is identical to the proof of Theorem 18.64.
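The two facts used in this proof, namely that \(\langle \cdot, \cdot \rangle_B\) induces the \(\ell_2\) norm and that it obeys the same identity as the real inner product, can be illustrated with a few lines of NumPy. This is a minimal sketch; the function name `inner_B`, the vector length and the seed are assumptions made for the example.

```python
import numpy as np

def inner_B(u, v):
    """Real part of the standard inner product on C^N."""
    return np.vdot(u, v).real

rng = np.random.default_rng(1)
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# (||u + v||^2 - ||u - v||^2) / 4 recovers the real inner product
polar = 0.25 * (np.linalg.norm(u + v) ** 2 - np.linalg.norm(u - v) ** 2)
print(inner_B(u, v), polar)                   # the two values agree
print(inner_B(u, u), np.linalg.norm(u) ** 2)  # induced norm is the l2 norm
```

Taking the real part makes the product symmetric in its two arguments, which is what allows the two cross terms to be merged in the expansion above.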

18.8.9. RIP and Orthogonal Projection#

The first result in this section is presented for real matrices. The generalization for complex matrices will be done later.

  1. Let \(\Lambda \subset \{1, \dots, N \}\) be an index set.

  2. Let \(\Phi \in \RR^{M \times N}\) satisfy RIP of order \(K\) with the restricted isometry constant \(\delta_K\).

  3. Assume that the columns of \(\Phi_{\Lambda}\) are linearly independent.

  4. We can define the pseudo inverse as

    \[ \Phi_{\Lambda}^{\dag} = \left (\Phi_{\Lambda}^H \Phi_{\Lambda} \right )^{-1} \Phi_{\Lambda}^H. \]
  5. The orthogonal projection operator onto the column space of \(\Phi_{\Lambda}\) is given by

    \[ \bP_{\Lambda} = \Phi_{\Lambda}\Phi_{\Lambda}^{\dag}. \]
  6. The orthogonal projection operator onto the orthogonal complement of \(\ColSpace(\Phi_{\Lambda})\) (column space of \(\Phi_{\Lambda}\)) is given by

    \[ \bP_{\Lambda}^{\perp} = \bI - \bP_{\Lambda}. \]
  7. Both \(\bP_{\Lambda}\) and \(\bP_{\Lambda}^{\perp}\) satisfy the usual properties like \(\bP = \bP^H\) and \(\bP^2 = \bP\).

We further define

\[ \Psi_{\Lambda} = \bP_{\Lambda}^{\perp} \Phi. \]
  1. We are orthogonalizing the columns in \(\Phi\) against \(\ColSpace(\Phi_{\Lambda})\).

  2. In other words, keeping the component of the column which is orthogonal to the column space of \(\Phi_{\Lambda}\).

  3. Obviously the columns in \(\Psi_{\Lambda}\) corresponding to the index set \(\Lambda\) would be \(\bzero\).
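The construction of \(\bP_{\Lambda}\), \(\bP_{\Lambda}^{\perp}\) and \(\Psi_{\Lambda}\) can be sketched as follows. The matrix sizes, the index set `Lam` (0-based) and the random Gaussian \(\Phi\) are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 6, 10
Lam = [1, 4, 7]                                   # hypothetical index set (0-based)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

Phi_L = Phi[:, Lam]
# Pseudo-inverse (Phi_L^H Phi_L)^{-1} Phi_L^H, valid for independent columns
Phi_L_pinv = np.linalg.solve(Phi_L.T @ Phi_L, Phi_L.T)
P = Phi_L @ Phi_L_pinv                            # projection onto ColSpace(Phi_L)
P_perp = np.eye(M) - P                            # projection onto its complement
Psi = P_perp @ Phi

print(np.allclose(P, P.T), np.allclose(P @ P, P))  # symmetric and idempotent
print(np.allclose(Psi[:, Lam], 0.0))               # columns indexed by Lam vanish
print(np.allclose(Phi_L.T @ Psi, 0.0))             # Psi is orthogonal to ColSpace(Phi_L)
```

Solving the normal equations keeps the sketch close to the formula in the text; for ill-conditioned \(\Phi_{\Lambda}\) a QR-based projection would be a numerically safer choice.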

We now present a result which shows that the matrix \(\Psi_{\Lambda}\) satisfies a modified version of RIP [25].

Theorem 18.66

If \(\Phi\) satisfies the RIP of order \(K\) with isometry constant \(\delta_K\), and \(\Lambda \subset \{1, \dots, N\}\) with \(|\Lambda | < K\), then the matrix \(\Psi_{\Lambda}\) satisfies the modified version of RIP as

(18.50)#\[\left ( 1 - \frac{\delta_K}{1 - \delta_K} \right ) \| \bx \|_2^2 \leq \| \Psi_{\Lambda} \bx \|_2^2 \leq (1 + \delta_K) \| \bx \|_2^2\]

for all \(\bx \in \RR^N\) such that \(\|\bx \|_0 \leq K - | \Lambda|\) and \(\supp(\bx) \cap \Lambda = \EmptySet\).

In words, if \(\Phi\) satisfies RIP of order \(K\), then \(\Psi_{\Lambda}\) acts as an approximate isometry on every \((K - |\Lambda|)\)-sparse vector supported on \(\Lambda^c\).
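As a quick numeric illustration (the value \(\delta_K = 0.2\) is an assumed example, not taken from any particular matrix), the bounds in (18.50) read

\[ 0.75 \, \| \bx \|_2^2 \leq \| \Psi_{\Lambda} \bx \|_2^2 \leq 1.2 \, \| \bx \|_2^2 \]

since \(1 - \frac{0.2}{1 - 0.2} = 0.75\). Note that the lower constant \(1 - \frac{\delta_K}{1 - \delta_K} = \frac{1 - 2 \delta_K}{1 - \delta_K}\) is positive only for \(\delta_K < \frac{1}{2}\), so the lower bound is informative only in that range.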

Proof. From the definition of \(\Psi_{\Lambda}\), we have

\[ \Psi_{\Lambda} \bx = (\bI - \bP_{\Lambda})\Phi \bx = \Phi \bx - \bP_{\Lambda} \Phi \bx. \]
  1. Alternatively

    \[ \Phi \bx = \Psi_{\Lambda} \bx + \bP_{\Lambda} \Phi \bx. \]
  2. Since \(\bP_{\Lambda}\) is an orthogonal projection, hence the vectors \(\bP_{\Lambda} \Phi \bx\) and \(\Psi_{\Lambda} \bx = \bP_{\Lambda}^{\perp} \Phi \bx\) are orthogonal.

  3. Thus, we can write

    (18.51)#\[ \| \Phi \bx \|_2^2 = \| \bP_{\Lambda} \Phi \bx \|_2^2 + \|\Psi_{\Lambda} \bx \|_2^2.\]
  4. We need to show that \(\| \Phi \bx \|_2 \approx \|\Psi_{\Lambda} \bx \|_2\) or alternatively that \(\| \bP_{\Lambda} \Phi \bx \|_2\) is small under the conditions of the theorem.

  5. Since \(\bP_{\Lambda} \Phi \bx\) is orthogonal to \(\Psi_{\Lambda} \bx\), we have

    (18.52)#\[\begin{split} \langle \bP_{\Lambda} \Phi \bx, \Phi \bx \rangle &= \langle \bP_{\Lambda} \Phi \bx, \Psi_{\Lambda} \bx + \bP_{\Lambda} \Phi \bx \rangle \\ &= \langle \bP_{\Lambda} \Phi \bx, \bP_{\Lambda} \Phi \bx \rangle + \langle \bP_{\Lambda} \Phi \bx, \Psi_{\Lambda} \bx \rangle\\ &= \langle \bP_{\Lambda} \Phi \bx, \bP_{\Lambda} \Phi \bx \rangle\\ &= \| \bP_{\Lambda} \Phi \bx \|_2^2.\end{split}\]
  6. Since \(\bP_{\Lambda}\) is a projection onto \(\ColSpace(\Phi_{\Lambda})\) (the column space of \(\Phi_{\Lambda}\)), there exists a vector \(\bz \in \RR^N\) such that \(\bP_{\Lambda} \Phi \bx = \Phi \bz\) and \(\supp(\bz) \subseteq \Lambda\).

  7. Since \(\supp(\bx) \cap \Lambda = \EmptySet\), hence \(\langle \bx, \bz \rangle = 0\).

  8. We also note that \(\| \bx + \bz \|_0 = \| \bx - \bz \|_0 \leq K\).

  9. Invoking Theorem 18.64, we have

    \[ | \langle \Phi \bz, \Phi \bx \rangle | \leq \delta_{K} \| \bz \|_2 \| \bx \|_2. \]
  10. Alternatively

    \[ | \langle \bP_{\Lambda} \Phi \bx, \Phi \bx \rangle | \leq \delta_{K} \| \bz \|_2 \| \bx \|_2. \]
  11. From RIP, we have

    \[ \sqrt{1 - \delta_K} \| \bz \|_2 \leq \| \Phi \bz \|_2 \]

    and

    \[ \sqrt{1 - \delta_K} \| \bx \|_2 \leq \| \Phi \bx \|_2. \]
  12. Thus

    \[ (1 - \delta_K)\| \bz \|_2 \| \bx \|_2 \leq \| \Phi \bz \|_2 \| \Phi \bx \|_2. \]
  13. This gives us

    \[ | \langle \bP_{\Lambda} \Phi \bx, \Phi \bx \rangle | \leq \frac{\delta_K}{1 - \delta_K} \| \bP_{\Lambda} \Phi \bx \|_2 \| \Phi \bx \|_2. \]
  14. Applying (18.52), we get

    \[ \| \bP_{\Lambda} \Phi \bx \|_2^2 \leq \frac{\delta_K}{1 - \delta_K} \| \bP_{\Lambda} \Phi \bx \|_2 \| \Phi \bx \|_2. \]
  15. Canceling the common term, we get

    \[ \| \bP_{\Lambda} \Phi \bx \|_2 \leq \frac{\delta_K}{1 - \delta_K} \| \Phi \bx \|_2. \]
  16. Trivially, we have \(\| \bP_{\Lambda} \Phi \bx \|_2 \geq 0\).

  17. Applying these bounds on (18.51), we obtain

    \[ \left ( 1 - \left ( \frac{\delta_K}{1 - \delta_K}\right )^2 \right ) \| \Phi \bx \|_2^2 \leq \|\Psi_{\Lambda} \bx \|_2^2 \leq \| \Phi \bx \|_2^2. \]
  18. Finally, using the RIP again with

    \[ (1 - \delta_K) \| \bx \|_2^2 \leq \|\Phi \bx \|_2^2 \leq (1 + \delta_K) \| \bx \|_2^2 \]

    we obtain

    \[ \left ( 1 - \left ( \frac{\delta_K}{1 - \delta_K}\right )^2 \right ) (1 - \delta_K) \| \bx \|_2^2 \leq \|\Psi_{\Lambda} \bx \|_2^2 \leq (1 + \delta_K) \| \bx \|_2^2 . \]
  19. Simplifying

    \[\begin{split} \left ( 1 - \left ( \frac{\delta_K}{1 - \delta_K}\right )^2 \right ) (1 - \delta_K) &= \frac{1 + \delta_K^2 - 2 \delta_K - \delta_K^2}{1 - \delta_K}\\ &= \frac{1 - 2 \delta_K }{1 - \delta_K} \\ &= 1 - \frac{\delta_K}{1 - \delta_K}. \end{split}\]
  20. Thus, we get the intended result in (18.50).

18.8.10. RIP for Higher Orders#

If \(\Phi\) satisfies RIP of order \(K\), does it satisfy RIP of some other order \(K' > K\)? There are some results available to answer this question.

Theorem 18.67

Let \(c\) and \(k\) be positive integers and let \(\Phi\) satisfy RIP of order \(2 k\). Then \(\Phi\) satisfies RIP of order \(c k\) with a restricted isometry constant

\[ \delta_{ck} \leq c \delta_{2 k} \]

if \(c \delta_{2 k} < 1\).

Note that this is only a sufficient condition. Thus if \(c \delta_{2 k} \geq 1\) we are not claiming whether \(\Phi\) satisfies RIP of order \(ck\) or not.

Proof. For \(c=1\), \(\delta_k \leq \delta_{2 k}\). For \(c=2\), \(\delta_{2 k} \leq 2 \delta_{2 k}\). These two cases are trivial. We now consider the case for \(c \geq 3\).

  1. Let \(S\) be an arbitrary index set of size \(c k\). Let

    \[ \Delta = \Phi_S^H \Phi_S - \bI. \]
  2. From Theorem 18.54, a sufficient condition for \(\Phi\) to satisfy RIP of order \(c k\) is that

    \[ \| \Delta \|_2 < 1 \]

    for all index sets \(S\) with \(|S|= c k\).

  3. Thus if we can show that

    \[ \| \Delta \|_2 \leq c \delta_{2 k} \]

    we would have shown that \(\Phi\) satisfies RIP of order \(c k\).

  4. We note that \(\Phi_S\) is of size \(M \times c k\).

  5. Thus \(\Delta\) is of size \(c k \times c k\).

  6. We partition \(\Delta\) into a block matrix of size \(c \times c\)

    \[\begin{split} \Delta = \begin{bmatrix} \Delta_{11} & \Delta_{12} & \dots & \Delta_{1 c}\\ \Delta_{21} & \Delta_{22} & \dots & \Delta_{2 c}\\ \vdots & \vdots & \ddots & \vdots\\ \Delta_{c 1} & \Delta_{c 2} & \dots & \Delta_{c c}\\ \end{bmatrix} \end{split}\]

    where each entry \(\Delta_{i j}\) is a square matrix of size \(k \times k\).

  7. Each diagonal block \(\Delta_{i i}\) corresponds to some \(\Phi_T^H \Phi_T - \bI\) where \(|T| = k\).

  8. Thus we have (see Theorem 18.53)

    \[ \| \Delta_{i i} \|_2 \leq \delta_k. \]
  9. The off-diagonal blocks \(\Delta_{i j}\) are

    \[ \Delta_{i j} = \Phi_P^H \Phi_Q \]

    where \(P\) and \(Q\) are disjoint index sets with \(|P| = |Q| = k\) and \( | P \cup Q | = 2 k\).

  10. Thus from the approximate orthogonality condition (Theorem 18.61 ) we have

    \[ \| \Delta_{i j} \|_2 \leq \delta_{2 k}. \]
  11. Finally we apply Gershgorin circle theorem for block matrices (Corollary 4.29).

  12. This gives us

    \[ | \| \Delta \|_2 - \|\Delta_{ii}\|_2| \leq \sum_{j, j\neq i} \|\Delta_{i j} \|_2 \text{ for some } i \in \{1,2, \dots, c \}. \]
  13. Thus we have

    \[\begin{split} & | \| \Delta \|_2 - \|\Delta_{ii}\|_2 | \leq \sum_{j, j\neq i} \delta_{2 k} = (c - 1) \delta_{2 k} \\ \implies & \| \Delta \|_2 \leq \|\Delta_{ii}\|_2 + (c - 1) \delta_{2 k} \\ \implies & \| \Delta \|_2 \leq \delta_k + (c - 1) \delta_{2 k} \\ \implies & \| \Delta \|_2 \leq \delta_{2 k} + (c - 1) \delta_{2 k} \\ \implies & \| \Delta \|_2 \leq c \delta_{2 k}. \end{split}\]
  14. We have shown that \(\| \Delta \|_2 \leq c \delta_{2 k} < 1\).

  15. Since this bound holds for every index set \(S\) of size \(c k\), we get \(\delta_{c k} \leq c \delta_{2 k}\).

  16. Hence \(\Phi\) indeed satisfies RIP of order \(c k\).

This theorem helps us extend RIP from order \(2 k\) to higher orders \(c k\). The bound is useful only when \(\delta_{2 k}\) is small enough that \(c \delta_{2 k} < 1\).
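The inequality \(\delta_{c k} \leq c \delta_{2 k}\) can be observed on a small random matrix. This is only a sketch: the sizes, the seed and the exhaustive-search helper `rip_constant` are assumptions made for the example, and the search is practical only for very small \(N\).

```python
import itertools
import numpy as np

def rip_constant(Phi, K):
    """Brute-force delta_K over all supports of size K (exponential cost)."""
    delta = 0.0
    for S in itertools.combinations(range(Phi.shape[1]), K):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.conj().T @ cols)  # ascending order
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

rng = np.random.default_rng(3)
M, N, k, c = 7, 9, 2, 3                 # arbitrary small sizes (assumption)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

delta_2k = rip_constant(Phi, 2 * k)
delta_ck = rip_constant(Phi, c * k)
print(f"delta_2k = {delta_2k:.4f}, delta_ck = {delta_ck:.4f}, "
      f"c * delta_2k = {c * delta_2k:.4f}")
```

The computed constants are nondecreasing in the order, and the higher-order constant stays below \(c \delta_{2 k}\). Whether that value is below 1, and hence certifies RIP of order \(c k\), depends on the matrix.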

18.8.11. Embeddings of Arbitrary Signals#

So far we have considered only sparse signals while analyzing the embedding properties of a RIP satisfying matrix \(\Phi\). In this subsection we wish to explore bounds on the \(\ell_2\) norm of an arbitrary signal when embedded by \(\Phi\). This result is adapted from [61].

Theorem 18.68

Let \(\Phi\) be an \(M \times N\) matrix satisfying

(18.53)#\[\| \Phi \bx \|_2 \leq \sqrt{1 + \delta_K} \| \bx \|_2 \Forall \bx \in \Sigma_K.\]

Then for every signal \(\bx \in \CC^N\), the following holds:

\[ \| \Phi \bx \|_2 \leq \sqrt{ 1 + \delta_K} \left [ \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1 \right ]. \]

We note that the theorem requires \(\Phi\) to satisfy only the upper bound of the RIP (18.43). The proof is slightly involved.

Proof. We note that the bound is trivially true for \(\bx = \bzero\). Hence in the following we will consider only \(\bx \neq \bzero\).

  1. Consider an arbitrary index set \(\Lambda \subset \{ 1, 2, \dots, N \}\) such that \(| \Lambda | \leq K\).

  2. Consider the unit ball in the Banach space \(\ell_2(\Lambda)\) given by

    (18.54)#\[B_2^{\Lambda} = \{ \bx \in \CC^N \ST \supp(\bx) \subseteq \Lambda \text{ and } \| \bx \|_2 \leq 1 \}\]

    i.e. the set of all signals whose support is contained in \(\Lambda\) and whose \(\ell_2\) norm is less than or equal to 1.

  3. Now define a convex body

    (18.55)#\[S = \ConvexHull \left \{ \bigcup_{| \Lambda | \leq K} B_2^{\Lambda} \right \} \subset \CC^N.\]
  4. We recall from Definition 9.9 that if \(\bx\) and \(\by\) belong to \(S\) then their convex combination \(\theta \bx + (1 - \theta) \by\) with \(\theta \in [0,1]\) must lie in \(S\).

  5. Further it can be verified that \(S\) is a compact convex set with non-empty interior.

  6. Hence it is a convex body.

  7. Consider any \(\bx \in B_2^{\Lambda_1}\) and \(\by \in B_2^{\Lambda_2}\).

  8. From (18.53) and (18.54) we have

    \[ \| \Phi \bx \|_2 \leq \sqrt{1 + \delta_K} \| \bx \|_2 \leq \sqrt{1 + \delta_K} \]

    and

    \[ \| \Phi \by \|_2 \leq \sqrt{1 + \delta_K} \| \by \|_2 \leq \sqrt{1 + \delta_K}. \]
  9. Now let

    \[ \bz = \theta \bx + (1 - \theta ) \by \text{ where } \theta \in [0, 1]. \]
  10. Then

    \[ \| \bz \|_2 = \| \theta \bx + (1 - \theta ) \by \|_2 \leq \theta \| \bx \|_2 + (1 - \theta ) \| \by \|_2 \leq \theta + (1 - \theta) = 1. \]
  11. Further

    \[\begin{split} \| \Phi \bz \|_2 &= \| \Phi ( \theta \bx + (1 - \theta ) \by) \|_2 \\ &\leq \| \Phi \theta \bx\|_2 + \| \Phi(1 - \theta ) \by \|_2\\ &= \theta \| \Phi \bx \|_2 + (1 - \theta) \| \Phi \by \|_2 \\ &\leq \theta \sqrt{1 + \delta_K} + (1 - \theta) \sqrt{1 + \delta_K}\\ &\leq \sqrt{1 + \delta_K}. \end{split}\]
  12. Similarly, it can be shown that for every vector \(\bx \in S\) we have \(\| \bx\|_2 \leq 1\) and \(\| \Phi \bx \|_2 \leq \sqrt{1 + \delta_K}\).

    1. Let \(\bx \in S\).

    2. Then \(\bx = \sum_{i=1}^r t_i \bx_i\) such that \(\bx_i \in B_2^{\Lambda_i}\) where \(|\Lambda_i| \leq K\), \(t_i \geq 0\) and \(\sum t_i = 1\).

    3. Hence \( \| \bx_i \|_2 \leq 1\) for every \(i\).

    4. Hence

      \[ \| \bx \|_2 \leq \sum_{i=1}^r t_i \| \bx_i \|_2 \leq \sum_{i=1}^r t_i = 1. \]
    5. Similarly \(\| \Phi \bx_i \|_2 \leq \sqrt{1 + \delta_K}\).

    6. Hence

      \[ \| \Phi \bx \|_2 \leq \sum_{i=1}^r t_i \| \Phi \bx_i \|_2 \leq \sqrt{1 + \delta_K}. \]
  13. We now define another convex body

    (18.56)#\[\Gamma = \left \{ \bx \ST \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1 \leq 1 \right \} \subset \CC^N.\]
  14. We quickly verify the convexity property.

    1. Let \(\bx, \by \in \Gamma\).

    2. Let

      \[ \bz = \theta \bx + (1 - \theta) \by \quad \text{ where } \theta \in [0,1]. \]
    3. Then

      \[\begin{split} & \| \bz \|_2 + \frac{1}{\sqrt{K}} \| \bz \|_1 \\ & = \| \theta \bx + (1 - \theta) \by \|_2 + \frac{1}{\sqrt{K}} \| \theta \bx + (1 - \theta) \by \|_1\\ & \leq \theta \| \bx \|_2 + (1 - \theta) \| \by \|_2 + \frac{\theta}{\sqrt{K}} \| \bx \|_1 + \frac{(1 - \theta)}{\sqrt{K}} \| \by \|_1 \\ & = \theta \left [ \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1 \right ] + (1 - \theta) \left [\| \by \|_2 + \frac{1}{\sqrt{K}} \| \by \|_1 \right ] \\ & \leq \theta + (1 - \theta) = 1. \end{split}\]
    4. Thus \(\bz \in \Gamma\).

    5. This analysis shows that all convex combinations of elements in \(\Gamma\) belong to \(\Gamma\).

    6. Thus \(\Gamma\) is convex.

    7. Further it can be verified that \(\Gamma\) is a compact convex set with non-empty interior.

    8. Hence it is a convex body.

  15. For any \(\bx \in \CC^N\) one can find a \(\by \in \Gamma\) by simply applying an appropriate nonzero scale \(\by = c \bx\) where the scale factor \(c\) depends on \(\bx\).

  16. For a moment suppose that \(\Gamma \subset S\).

  17. Then if \(\by \in \Gamma\) the following are true:

    \[ \| \by \|_2 + \frac{1}{\sqrt{K}} \| \by \|_1 \leq 1 \]

    and

    \[ \| \Phi \by \|_2 \leq \sqrt{1 + \delta_K}. \]
  18. Now consider an arbitrary nonzero \(\bx \in \CC^N\).

  19. Let

    \[ \alpha = \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1. \]
  20. Define

    \[ \by = \frac{1}{\alpha} \bx. \]
  21. Then

    \[ \|\by \|_2 + \frac{1}{\sqrt{K}} \| \by \|_1 = \frac{1}{\alpha} \left ( \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1\right ) = 1. \]
  22. Thus \(\by \in \Gamma\) and

    \[\begin{split} & \| \Phi \by \|_2 \leq \sqrt{1 + \delta_K}\\ \implies & \left \| \Phi \frac{1}{\alpha} \bx \right \|_2 \leq \sqrt{1 + \delta_K}\\ \implies & \| \Phi \bx \|_2 \leq \sqrt{1 + \delta_K} \alpha \\ \implies & \| \Phi \bx \|_2 \leq \sqrt{1 + \delta_K} \left ( \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1 \right ) \Forall \bx \in \CC^N \end{split}\]

    which is our intended result.

  23. Hence if we show that \(\Gamma \subset S\) holds, we would have proven our theorem.

We will achieve this by showing that every vector \(\bx \in \Gamma\) is a convex combination of vectors in \(S\).

  1. We start with an arbitrary \(\bx \in \Gamma\).

  2. Let \(I = \supp(\bx)\).

  3. We partition \(I\) into disjoint sets of size \(K\).

  4. Let there be \(J+1\) such sets given by

    \[ I = \bigcup_{j = 0}^J I_j. \]
  5. Let \(I_0\) index the \(K\) largest entries in \(\bx\) (magnitude wise).

  6. Let \(I_1\) index the next \(K\) largest entries and so on.

  7. Since \(|I|\) may not be a multiple of \(K\), hence the last index set \(I_J\) may not have \(K\) indices.

  8. We define

    \[\begin{split} \bx_{I_j}(i) = \left\{ \begin{array}{ll} x(i) & \mbox{if $i \in I_j$};\\ 0 & \mbox{otherwise}. \end{array} \right. \end{split}\]
  9. Thus we can write

    \[ \bx = \sum_{j = 0}^J \bx_{I_j}. \]
  10. Now let

    \[ \theta_j = \| \bx_{I_j} \|_2 \; \text{ and } \; \by_j = \frac{1}{\theta_j} \bx_{I_j}. \]
  11. We can write

    \[ \bx = \sum_{j = 0}^J \theta_j \by_j. \]
  12. In this construction of \(\bx\) we can see that \(1 \geq \theta_0 \geq \theta_1 \geq \dots \geq \theta_J \geq 0\).

  13. Also \(\by_j \in S\) since \(\by_j\) is a unit norm \(K\) sparse vector (18.55).

  14. We will show that \(\sum_j \theta_j \leq 1\) in a short while.

  15. This will imply that \(\bx\) is a convex combination of vectors from \(S\) (with the remaining weight \(1 - \sum_j \theta_j\) placed on \(\bzero \in S\)).

  16. But since \(S\) is convex hence \(\bx \in S\).

  17. This will imply that \(\Gamma \subset S\).

  18. The proof will be complete.

We now show that \(\sum_j \theta_j \leq 1\).

  1. Pick any \(j \in \{1, \dots, J \}\).

  2. Since \(\bx_{I_j}\) is \(K\)-sparse hence due to Theorem 18.17 we have

    \[ \theta_j = \| \bx_{I_j} \|_2 \leq \sqrt{K} \| \bx_{I_j} \|_{\infty}. \]
  3. It is easy to see that \(I_{j-1}\) identifies exactly \(K\) nonzero entries in \(\bx\) and each of the nonzero entries in \(\bx_{I_{j -1}}\) is at least as large as the largest entry in \(\bx_{I_j}\) (magnitude wise).

  4. Thus we have

    \[ \| \bx_{I_{j-1}} \|_1 = \sum_{ i \in I_{j-1}} | x_i | \geq \sum_{ i \in I_{j-1}} \| \bx_{I_j} \|_{\infty} = K \| \bx_{I_j} \|_{\infty}. \]
  5. Thus

    \[ \| \bx_{I_j} \|_{\infty} \leq \frac{1}{K} \| \bx_{I_{j-1}} \|_1. \]
  6. Combining the two inequalities we get

    \[ \theta_j \leq \frac{1}{\sqrt{K}} \| \bx_{I_{j -1}} \|_1. \]
  7. This lets us write

    \[ \sum_{j=1}^{J}\theta_j \leq \sum_{j=1}^{J}\frac{1}{\sqrt{K}} \| \bx_{I_{j -1}} \|_1 \leq \frac{1}{\sqrt{K}} \| \bx \|_1 \]

    since

    \[ \| \bx \|_1 = \sum_{j = 0}^J \| \bx_{I_j} \|_1 \geq \sum_{j = 1}^J \| \bx_{I_{j-1}} \|_1. \]
  8. Finally

    \[ \theta_0 = \| \bx_{I_0} \|_2 \leq \| \bx \|_2. \]
  9. This gives us the inequality

    \[ \sum_{j = 0}^J \theta_j \leq \| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1 \leq 1 \]

    since \(\bx \in \Gamma\).

  10. Recalling our steps we can express \(\bx\) as

    \[ \bx = \sum_{j = 0}^J \theta_j \by_j \]

    where \(\by_j \in S\) and \(\sum_j \theta_j \leq 1\). Since \(S\) is convex and contains \(\bzero\), this implies that \(\bx \in S\).

  11. Thus \(\Gamma \subset S\).

  12. This completes the proof.
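The block decomposition used in the second half of the proof is easy to reproduce numerically. The following sketch splits the support of a vector into blocks of the \(K\) largest entries, computes \(\theta_j = \| \bx_{I_j} \|_2\) and compares \(\sum_j \theta_j\) with \(\| \bx \|_2 + \frac{1}{\sqrt{K}} \| \bx \|_1\). The helper name `sorted_blocks`, the test vector and the seed are assumptions made for illustration.

```python
import numpy as np

def sorted_blocks(x, K):
    """Index sets I_0, I_1, ..., I_J: supp(x) split into blocks of (at most)
    K indices, ordered by decreasing magnitude of the entries."""
    support = np.flatnonzero(x)
    order = support[np.argsort(-np.abs(x[support]), kind="stable")]
    return [order[i:i + K] for i in range(0, len(order), K)]

rng = np.random.default_rng(4)
K = 3
x = rng.standard_normal(11)
x[[2, 5]] = 0.0                          # leave a few entries outside the support

blocks = sorted_blocks(x, K)
thetas = [np.linalg.norm(x[I]) for I in blocks]
bound = np.linalg.norm(x) + np.linalg.norm(x, 1) / np.sqrt(K)

print([len(I) for I in blocks])          # block sizes
print(thetas)                            # non-increasing sequence
print(sum(thetas) <= bound)              # sum of thetas is within the bound
```

The comparison does not require \(\bx \in \Gamma\); membership in \(\Gamma\) is what makes the right hand side at most 1 in the proof.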

18.8.12. A General Form of RIP#

A more general restricted isometry bound for an arbitrary matrix \(\Phi\) can be stated as follows

\[ \alpha \| \bx \|^2_2 \leq \| \Phi \bx \|^2_2 \leq \beta \| \bx \|^2_2 \]

where \(0 < \alpha \leq \beta < \infty\).

It is straightforward to scale \(\Phi\) to match the bounds in (18.43).

Let \(\delta_K = \frac{\beta - \alpha}{\alpha + \beta}\). Then \(1 - \delta_K = \frac{2\alpha}{\alpha + \beta}\) and \(1 + \delta_K = \frac{2\beta}{\alpha + \beta}\).

Putting in (18.43) we get

\[\begin{split} & \frac{2\alpha}{\alpha + \beta} \| \bx \|^2_2 \leq \| \Phi \bx \|^2_2 \leq \frac{2\beta}{\alpha + \beta} \| \bx \|^2_2 \\ \implies & \alpha \| \bx \|^2_2 \leq \| \sqrt{\frac{\alpha + \beta}{2}} \Phi \bx \|^2_2 \leq \beta \| \bx \|^2_2. \end{split}\]

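Thus if \(\Phi\) satisfies the more general bound with constants \(\alpha\) and \(\beta\), multiplying \(\Phi\) by \(\sqrt{2/(\alpha + \beta)}\) transforms the bound to the form of (18.43) with \(\delta_K = \frac{\beta - \alpha}{\alpha + \beta}\).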
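The rescaling can be written as a small helper. This is a minimal sketch; the function name `normalize_bounds` and the sample values of \(\alpha\) and \(\beta\) are assumptions chosen for illustration.

```python
import math

def normalize_bounds(alpha, beta):
    """Return (scale, delta) such that scale * Phi satisfies the symmetric
    bounds (1 - delta) and (1 + delta) whenever Phi satisfies
    alpha ||x||^2 <= ||Phi x||^2 <= beta ||x||^2."""
    if not 0 < alpha <= beta:
        raise ValueError("need 0 < alpha <= beta")
    delta = (beta - alpha) / (alpha + beta)
    scale = math.sqrt(2.0 / (alpha + beta))
    return scale, delta

scale, delta = normalize_bounds(2.0, 6.0)
print(scale, delta)                      # 0.5 0.5
print(scale**2 * 2.0, 1 - delta)         # lower constants agree
print(scale**2 * 6.0, 1 + delta)         # upper constants agree
```

Since \(\alpha > 0\), the resulting \(\delta\) is always strictly less than 1, and it equals 0 exactly when \(\alpha = \beta\).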

18.8.13. Finding out RIP Constants#

The optimal value of RIP constant of \(K\)-th order \(\delta_K\) can be obtained by solving the following optimization problem.

Algorithm 18.2

\[\begin{split} & \underset{0 < \delta < 1}{\text{minimize}} & & \delta\\ & \text{subject to } & & (1 - \delta) \|\bx\|^2_2 \leq \| \Phi \bx \|^2_2 \leq (1 + \delta) \| \bx\|^2_2 \Forall \bx \in \Sigma_K. \end{split}\]

This problem isn’t easy to solve. In fact it has been shown in [4] that this problem is NP-hard.
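For very small matrices the constant can still be found by exhaustive search, which also makes the combinatorial cost visible. The sketch below is an assumption-laden illustration and not a practical algorithm: the helper name `rip_constant` and the tiny test matrix are hypothetical, and the loop visits every support of size \(K\).

```python
import itertools
import math
import numpy as np

def rip_constant(Phi, K):
    """Smallest delta >= 0 with (1 - delta)||x||^2 <= ||Phi x||^2 <= (1 + delta)||x||^2
    for all K-sparse x, found by checking every support of size K."""
    delta = 0.0
    for S in itertools.combinations(range(Phi.shape[1]), K):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.conj().T @ cols)  # ascending order
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

Phi = np.array([[1.0, 0.6],
                [0.0, 0.8]])             # two unit-norm columns with inner product 0.6
print(rip_constant(Phi, 2))              # approximately 0.6
print(math.comb(1000, 10))               # supports to visit for N = 1000, K = 10
```

If the returned value is 1 or more, the matrix does not satisfy RIP of that order in the sense of Definition 18.31. The number of supports grows as \(\binom{N}{K}\), which is why this approach does not scale.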

18.8.14. RIP and Coherence#

Here we establish a relationship between the RIP constants and coherence of a dictionary.

Rather than a general matrix \(\Phi\), we restrict our attention to a dictionary \(\bDDD \in \CC^{N \times D}\). We assume that the dictionary is overcomplete \((D > N)\) and full rank \(\Rank(\bDDD) = N\). The dictionary is assumed to satisfy RIP of some order.

Theorem 18.69 (Coherence upper bound for RIP constant)

Let \(\bDDD\) satisfy RIP of order \(K\). Then

\[ \delta_K \leq (K - 1) \mu (\bDDD). \]

Proof. We recall that \(\delta_K\) is the smallest constant \(\delta\) satisfying

\[ (1 - \delta) \| \bx\|^2_2 \leq \| \bDDD \bx \|^2_2 \leq (1 + \delta) \| \bx\|^2_2 \; \Forall \bx \in \Sigma_K. \]
  1. Let \(\Lambda\) be any index set with \(| \Lambda | = K\).

  2. Then

    \[ \| \bDDD \bx \|^2_2 = \| \bDDD_{\Lambda} \bx_{\Lambda} \|^2_2 \Forall \bx \in \CC^{\Lambda}. \]
  3. Since \(\bDDD\) satisfies RIP of order \(K\), hence \(\bDDD_{\Lambda}\) is a subdictionary (its columns are linearly independent).

  4. Recall from Theorem 18.30 that

    \[ (1 - (K - 1) \mu) \| \bv \|_2^2 \leq \| \bDDD_{\Lambda} \bv \|_2^2 \leq (1 + (K - 1) \mu)\| \bv \|_2^2 \]

    holds true for every \(\bv \in \CC^K\).

  5. Since \(\delta_K\) is smallest possible constant, hence

    \[ 1 + \delta_K \leq 1 + (K - 1) \mu \implies \delta_K \leq (K - 1) \mu (\bDDD). \]
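The coherence bound can be compared with exhaustively computed constants on a small dictionary. This is a sketch under assumptions: the dictionary is a random real matrix with atoms normalized to unit norm, the sizes and seed are arbitrary, and the helper `rip_constant_from_gram` is a hypothetical name for a brute-force search over the Gram matrix.

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)
N, D = 6, 9                                     # signal dimension, number of atoms
Dict = rng.standard_normal((N, D))
Dict /= np.linalg.norm(Dict, axis=0)            # unit-norm atoms (assumption)
G = Dict.T @ Dict                               # Gram matrix
mu = np.max(np.abs(G - np.eye(D)))              # coherence: largest off-diagonal entry

def rip_constant_from_gram(G, K):
    """Brute-force delta_K from the K x K principal submatrices of the Gram matrix."""
    delta = 0.0
    for S in itertools.combinations(range(G.shape[0]), K):
        eig = np.linalg.eigvalsh(G[np.ix_(S, S)])  # ascending order
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

for K in (2, 3):
    print(K, rip_constant_from_gram(G, K), (K - 1) * mu)
```

For \(K = 2\) the two quantities coincide, since a \(2 \times 2\) Gram matrix of unit-norm atoms has eigenvalues \(1 \pm |g|\) where \(g\) is the inner product of the two atoms. For larger \(K\) the coherence bound is typically loose.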