4.12. Matrix Norms#

This section reviews various matrix norms on \((\CC^{m \times n}, \CC)\), the vector space of \(m \times n\) complex matrices over the field of complex numbers.

We know \((\CC^{m \times n}, \CC)\) is a finite dimensional vector space with dimension \(m n\). We will usually refer to it as \(\CC^{m \times n}\).

Matrix norms will follow the usual definition of norms for a vector space.

Definition 4.139 (Matrix norm)

A function \(\| \cdot \| : \CC^{m \times n} \to \RR\) is called a matrix norm on \(\CC^{m \times n}\) if for all \(\bA, \bB \in \CC^{m \times n}\) and all \(\alpha \in \CC\) it satisfies the following:

  • [Positivity]

    \[ \| \bA \| \geq 0 \]

    with \(\| \bA \| = 0 \iff \bA = \ZERO\).

  • [Homogeneity]

    \[ \| \alpha \bA \| = | \alpha | \| \bA \|. \]
  • [Triangle inequality]

    \[ \| \bA + \bB \| \leq \| \bA \| + \| \bB \|. \]

We recall some of the standard results on finite dimensional normed vector spaces.

All matrix norms are equivalent. Let \(\| \cdot \|\) and \(\| \cdot \|'\) be two different matrix norms on \(\CC^{m \times n}\). Then there exist two positive constants \(a\) and \(b\) such that the following holds

\[ a \| \bA \| \leq \| \bA \|' \leq b \| \bA \| \Forall \bA \in \CC^{m \times n}. \]

A matrix norm is a continuous function \(\| \cdot \| : \CC^{m \times n} \to \RR\).

4.12.1. Norms like \(\ell_p\) on Complex Vector Space#

The following norms are analogous to the \(\ell_p\) norms on the finite dimensional complex vector space \(\CC^n\). They arise from the fact that the matrix vector space \(\CC^{m\times n}\) is in one-to-one correspondence with the complex vector space \(\CC^{m n}\).

Definition 4.140 (Sum norm)

Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).

The matrix sum norm is defined as

\[ \| \bA \|_S = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |. \]

Definition 4.141 (Frobenius norm)

Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).

The matrix Frobenius norm is defined as

\[ \| \bA \|_F = \left ( \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 \right )^{\frac{1}{2}}. \]

Definition 4.142 (Max norm)

Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).

The matrix max norm is defined as

\[\begin{split} \| \bA \|_M = \underset{\substack{ 1 \leq i \leq m \\ 1 \leq j \leq n}}{\max} | a_{i j} |. \end{split}\]
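These three norms are simple entrywise computations; a minimal NumPy sketch on a random matrix (only the Frobenius norm has a direct built-in counterpart, `np.linalg.norm(A, 'fro')`):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

sum_norm = np.abs(A).sum()                   # sum norm: total of all |a_ij|
frob_norm = np.sqrt((np.abs(A) ** 2).sum())  # Frobenius norm
max_norm = np.abs(A).max()                   # max norm: largest |a_ij|

# The Frobenius norm agrees with NumPy's built-in 'fro' matrix norm.
assert np.isclose(frob_norm, np.linalg.norm(A, 'fro'))
print(sum_norm, frob_norm, max_norm)
```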

4.12.2. Properties of Frobenius Norm#

We now prove some elementary properties of Frobenius norm.

Lemma 4.86

The Frobenius norm of a matrix is equal to the Frobenius norm of its Hermitian transpose.

\[ \| \bA^H \|_F = \| \bA \|_F. \]

Proof. This holds since the norm involves only the absolute values of the entries of the matrix.

  1. Let

    \[ \bA = [a_{i j}]. \]
  2. Then

\[ \bA^H = [\overline{a_{j i}}]. \]
  3. Then

\[ \| \bA^H \|_F^2 = \sum_{j=1}^n \sum_{i=1}^{m} | \overline{a_{i j}} |^2 = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 = \| \bA \|_F^2. \]
  4. Now

    \[ \| \bA^H \|_F^2 = \| \bA \|_F^2 \implies \| \bA^H \|_F = \| \bA \|_F. \]

Lemma 4.87 (Expansion in squared norms of columns)

Let \(\bA \in \CC^{m \times n}\) be written as a row of column vectors

\[ \bA = \begin{bmatrix} \ba_1 & \dots & \ba_n \end{bmatrix}. \]

Then

\[ \| \bA \|_F^2 = \sum_{j=1}^{n} \| \ba_j \|_2^2. \]

Proof. We note that

\[ \| \ba_j \|_2^2 = \sum_{i=1}^m | a_{i j} |^2. \]

Now

\[ \| \bA \|_F^2 = \left ( \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 \right ) = \left ( \sum_{j=1}^n \left ( \sum_{i=1}^{m} | a_{i j} |^2 \right ) \right ) = \left (\sum_{j=1}^n \| \ba_j \|_2^2 \right). \]

We have thus shown that the square of the Frobenius norm of a matrix is simply the sum of squared \(\ell_2\) norms of its columns.

Lemma 4.88

Let \(\bA \in \CC^{m \times n}\) be written as a column of row vectors

\[\begin{split} \bA = \begin{bmatrix} \underline{\ba}^1 \\ \vdots \\ \underline{\ba}^m \end{bmatrix}. \end{split}\]

Then

\[ \| \bA \|_F^2 = \sum_{i=1}^{m} \| \underline{\ba}^i \|_2^2. \]

Proof. We note that

\[ \| \underline{\ba}^i \|_2^2 = \sum_{j=1}^n | a_{i j} |^2. \]

Now

\[ \| \bA \|_F^2 = \left ( \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 \right ) = \sum_{i=1}^{m} \| \underline{\ba}^i \|_2^2. \]
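Both expansions are easy to check numerically; a minimal sketch using NumPy on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

frob_sq = np.linalg.norm(A, 'fro') ** 2

# Lemma 4.87: sum of squared ell_2 norms of the columns.
cols_sq = sum(np.linalg.norm(A[:, j]) ** 2 for j in range(A.shape[1]))

# Lemma 4.88: sum of squared ell_2 norms of the rows.
rows_sq = sum(np.linalg.norm(A[i, :]) ** 2 for i in range(A.shape[0]))

assert np.isclose(frob_sq, cols_sq) and np.isclose(frob_sq, rows_sq)
```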

We now consider how the Frobenius norm is affected by the action of unitary matrices.

  1. Let \(\bA\) be an arbitrary matrix in \(\CC^{m \times n}\).

  2. Let \(\bU\) be a unitary matrix in \(\CC^{m \times m}\).

  3. Let \(\bV\) be a unitary matrix in \(\CC^{n \times n}\).

Our first result is that multiplication by a unitary matrix does not change the Frobenius norm of a matrix.

Theorem 4.141

The Frobenius norm of a matrix is invariant under pre- or post-multiplication by a unitary matrix; i.e.,

\[ \| \bU \bA \|_F = \| \bA \|_F \]

and

\[ \| \bA \bV \|_F = \| \bA \|_F. \]

Proof. We can write \(\bA\) as

\[ \bA = \begin{bmatrix} \ba_1 & \dots & \ba_n \end{bmatrix}. \]

So

\[ \bU \bA = \begin{bmatrix} \bU \ba_1 & \dots & \bU \ba_n \end{bmatrix}. \]

Applying Lemma 4.87, we get

\[ \| \bU \bA \|_F^2 = \sum_{j=1}^{n} \|\bU \ba_j \|_2^2. \]

But we know that unitary matrices are norm preserving. Hence

\[ \|\bU \ba_j \|_2^2 = \| \ba_j \|_2^2. \]

Thus

\[ \| \bU \bA \|_F^2 = \sum_{j=1}^{n} \|\ba_j \|_2^2 = \| \bA \|_F^2 \]

which implies

\[ \| \bU \bA \|_F = \| \bA \|_F. \]

Similarly writing \(\bA\) as

\[\begin{split} \bA = \begin{bmatrix} \br_1 \\ \vdots \\ \br_m \end{bmatrix}. \end{split}\]

we have

\[\begin{split} \bA \bV = \begin{bmatrix} \br_1 \bV\\ \vdots \\ \br_m \bV \end{bmatrix}. \end{split}\]

Applying Lemma 4.88, we get

\[ \| \bA \bV \|_F^2 = \sum_{i=1}^{m} \| \br_i \bV \|_2^2. \]

But we know that unitary matrices are norm preserving. Hence

\[ \|\br_i \bV \|_2^2 = \|\br_i \|_2^2. \]

Thus

\[ \| \bA \bV \|_F^2 = \sum_{i=1}^{m} \| \br_i \|_2^2 = \| \bA \|_F^2 \]

which implies

\[ \| \bA \bV \|_F = \| \bA \|_F. \]

An alternative one-line proof of the second part, using the first part, is

\[ \| \bA \bV \|_F = \| (\bA \bV)^H \|_F = \| \bV^H \bA^H \|_F = \| \bA^H \|_F = \| \bA \|_F. \]

Here we used Lemma 4.86, the fact that \(\bV\) being unitary implies \(\bV^H\) is also unitary, and the already established result that pre-multiplication by a unitary matrix preserves the Frobenius norm.
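A quick numerical check of this invariance; the unitary factors are generated here from QR factorizations of random complex matrices, which is one convenient way to obtain them:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

# QR factorization of a random complex square matrix yields a unitary Q.
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))

f = lambda M: np.linalg.norm(M, 'fro')
assert np.isclose(f(U @ A), f(A))      # pre-multiplication
assert np.isclose(f(A @ V), f(A))      # post-multiplication
assert np.isclose(f(U @ A @ V), f(A))  # both
```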

Theorem 4.142 (Consistency of Frobenius norm)

Let \(\bA \in \CC^{m \times n}\) and \(\bB \in \CC^{n \times p}\) be two matrices. Then the Frobenius norm of their product is less than or equal to the product of the Frobenius norms of the matrices themselves; i.e.,

\[ \| \bA \bB \|_F \leq \| \bA \|_F \| \bB \|_F. \]

Proof. We can write \(\bA\) as

\[\begin{split} \bA = \begin{bmatrix} \ba_1^T \\ \vdots \\ \ba_m^T \end{bmatrix} \end{split}\]

where \(\ba_i\) are \(m\) column vectors whose transposes \(\ba_i^T\) are the rows of \(\bA\). Similarly, we can write \(\bB\) as

\[ \bB = \begin{bmatrix} \bb_1 & \dots & \bb_p \end{bmatrix} \]

where \(\bb_i\) are column vectors corresponding to columns of \(\bB\).

Then

\[\begin{split} \bA \bB = \begin{bmatrix} \ba_1^T \\ \vdots \\ \ba_m^T \end{bmatrix} \begin{bmatrix} \bb_1 & \dots & \bb_p \end{bmatrix} = \begin{bmatrix} \ba_1^T \bb_1 & \dots & \ba_1^T \bb_p\\ \vdots & \ddots & \vdots \\ \ba_m^T \bb_1 & \dots & \ba_m^T \bb_p \end{bmatrix} = \begin{bmatrix} \ba_i^T \bb_j \end{bmatrix} . \end{split}\]

Now looking carefully

\[ \ba_i^T \bb_j = \langle \ba_i, \overline{\bb_j} \rangle. \]

Applying the Cauchy-Schwarz inequality, we have

\[ | \langle \ba_i, \overline{\bb_j} \rangle |^2 \leq \| \ba_i \|_2^2 \| \overline{\bb_j} \|_2^2 = \| \ba_i \|_2^2 \| \bb_j \|_2^2. \]

Now

\[\begin{split} \| \bA \bB \|_F^2 &= \sum_{i=1}^{m} \sum_{j=1}^{p} | \ba_i^T \bb_j |^2\\ &\leq \sum_{i=1}^{m} \sum_{j=1}^{p} \| \ba_i \|_2^2 \| \bb_j \|_2^2\\ &= \left ( \sum_{i=1}^{m} \| \ba_i \|_2^2 \right ) \left ( \sum_{j=1}^{p} \| \bb_j \|_2^2\right )\\ &= \| \bA \|_F^2 \| \bB \|_F^2 \end{split}\]

which implies

\[ \| \bA \bB \|_F \leq \| \bA \|_F \| \bB \|_F \]

by taking square roots on both sides.

Corollary 4.28

Let \(\bA \in \CC^{m \times n}\) and let \(\bx \in \CC^n\). Then

\[ \| \bA \bx \|_2 \leq \| \bA \|_F \| \bx \|_2. \]

Proof. We note that the Frobenius norm of a column matrix is the same as the \(\ell_2\) norm of the corresponding column vector; i.e.,

\[ \| \bx \|_F = \| \bx \|_2 \Forall \bx \in \CC^n. \]

Now applying Theorem 4.142 we have

\[ \| \bA \bx \|_2 = \| \bA \bx \|_F \leq \| \bA \|_F \| \bx \|_F = \| \bA \|_F \| \bx \|_2 \Forall \bx \in \CC^n. \]
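Both Theorem 4.142 and Corollary 4.28 are easy to verify on random instances; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)

f = lambda M: np.linalg.norm(M, 'fro')

# Theorem 4.142: the Frobenius norm is consistent (sub-multiplicative).
assert f(A @ B) <= f(A) * f(B)

# Corollary 4.28: the Frobenius norm is subordinate to the ell_2 norm.
assert np.linalg.norm(A @ x) <= f(A) * np.linalg.norm(x)
```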

It turns out that the Frobenius norm is intimately related to the singular value decomposition of a matrix.

Lemma 4.89 (Frobenius norm and singular values)

Let \(\bA \in \CC^{m \times n}\). Let the singular value decomposition of \(\bA\) be given by

\[ \bA = \bU \Sigma \bV^H. \]

Let the singular values of \(\bA\) be \(\sigma_1, \dots, \sigma_k\) where \(k = \min(m, n)\). Then

\[ \| \bA \|_F = \sqrt {\sum_{i=1}^k \sigma_i^2}. \]

Proof. This comes from the invariance of Frobenius norm with unitary transformations.

\[ \bA = \bU \Sigma \bV^H \implies \| \bA \|_F = \| \bU \Sigma \bV^H \|_F. \]

But

\[ \| \bU \Sigma \bV^H \|_F = \| \Sigma \bV^H \|_F = \| \Sigma \|_F \]

since \(\bU\) and \(\bV\) are unitary matrices (see Theorem 4.141). Now the only nonzero entries of \(\Sigma\) are the singular values. Hence

\[ \| \bA \|_F = \| \Sigma \|_F = \sqrt {\sum_{i=1}^k \sigma_i^2}. \]
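A short numerical check of Lemma 4.89 using NumPy's SVD:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# The k = min(m, n) singular values of A.
sigma = np.linalg.svd(A, compute_uv=False)

assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt((sigma ** 2).sum()))
```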

4.12.3. Consistency of a Matrix Norm#

Definition 4.143 (Consistent matrix norm)

A matrix norm \(\| \cdot \|\) is called consistent on \(\CC^{n \times n}\) if

(4.19)#\[\| \bA \bB \| \leq \| \bA \| \| \bB \| \]

holds true for all \(\bA, \bB \in \CC^{n \times n}\). A matrix norm \(\| \cdot \|\) is called consistent if it is defined on \(\CC^{m \times n}\) for all \(m, n \in \Nat\) and (4.19) holds for all matrices \(\bA, \bB\) for which the product \(\bA \bB\) is defined.

A consistent matrix norm is also known as a sub-multiplicative norm.

With this definition and the result in Theorem 4.142, we can see that the Frobenius norm is consistent.

4.12.4. Subordinate Matrix Norm#

A matrix maps vectors from one space to vectors in another space. It is interesting to explore the connection between the norm of a matrix and the norms of vectors in its domain and co-domain.

Definition 4.144

Let \(m, n \in \Nat\) be given. Let \(\| \cdot \|_{\alpha}\) be some norm on \(\CC^m\) and \(\| \cdot \|_{\beta}\) be some norm on \(\CC^n\). Let \(\| \cdot \|\) be some norm on matrices in \(\CC^{m \times n}\). We say that \(\| \cdot \|\) is subordinate to the vector norms \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\) if

\[ \| \bA \bx \|_{\alpha} \leq \| \bA \| \| \bx \|_{\beta} \]

for all \(\bA \in \CC^{m \times n}\) and for all \(\bx \in \CC^n\). In other words, the operation of \(\bA\) cannot increase the length of a vector by more than a factor of \(\| \bA \|\).

If \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\) are the same, we say that \(\| \cdot \|\) is subordinate to the vector norm \(\| \cdot \|_{\alpha}\).

We showed earlier in Corollary 4.28 that the Frobenius norm is subordinate to the Euclidean norm.

4.12.5. Operator Norm#

We now consider the maximum factor by which a matrix \(\bA\) can increase the length of a vector.

Definition 4.145 (Operator norm)

Let \(m, n \in \Nat\) be given. Let \(\| \cdot \|_{\alpha}\) be some norm on \(\CC^n\) and \(\| \cdot \|_{\beta}\) be some norm on \(\CC^m\). For \(\bA \in \CC^{m \times n}\) we define

\[ \| \bA \| \triangleq \| \bA \|_{\alpha \to \beta} \triangleq \underset{\bx \neq 0}{\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}. \]

The term \(\frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}\) is the factor by which \(\bA\) scales the length of \(\bx\). We simply pick the maximum such scaling factor over all nonzero \(\bx\).

The norm defined above is known as the \((\alpha \to \beta)\) operator norm, the \((\alpha \to \beta)\)-norm, or simply the \(\alpha\)-norm when \(\alpha = \beta\).

We need to verify that this definition satisfies all properties of a norm.

Positivity

  1. Clearly if \(\bA = \ZERO\) then \(\bA \bx = \bzero\) always, hence \(\| \bA \| = 0\).

  2. Conversely, if \(\| \bA \| = 0\) then \(\| \bA \bx \|_{\beta} = 0 \Forall \bx \in \CC^n\).

  3. Hence \(\bA \bx = \bzero\) for every \(\bx \in \CC^n\).

  4. In particular this is true for the unit vectors \(\be_i \in \CC^n\).

  5. The \(i\)-th column of \(\bA\) is given by \(\bA \be_i\) which is \(\bzero\).

  6. Thus each column in \(\bA\) is \(\bzero\).

  7. Hence \(\bA = \ZERO\).

Homogeneity

  1. Consider \(c \in \CC\).

  2. Then

    \[ \| c \bA \| = \underset{\bx \neq 0}{\max } \frac{\| c \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} = | c | \underset{\bx \neq 0}{\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} = | c | \| \bA \|. \]

We now present some useful observations about the operator norm before proving the triangle inequality for it.

  1. For any \(\bx \in \Kernel(\bA)\), \(\bA \bx = \bzero\); hence, as long as \(\bA \neq \ZERO\), we only need to consider vectors which do not belong to the kernel of \(\bA\).

  2. Thus we can write

    \[ \| \bA \|_{\alpha \to \beta} = \underset{\bx \notin \Kernel(\bA)} {\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}. \]
  3. We also note that

    \[ \frac{\| \bA c \bx \|_{\beta}}{\| c \bx \|_{\alpha}} = \frac{| c | \| \bA \bx \|_{\beta}}{ | c | \| \bx \|_{\alpha}} = \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} \Forall c \neq 0, \bx \neq \bzero. \]
  4. Thus, it is sufficient to find the maximum on unit norm vectors:

    \[ \| \bA \|_{\alpha \to \beta} = \underset{\| \bx \|_{\alpha} = 1} {\max } \| \bA \bx \|_{\beta}. \]
  5. Since \(\|\bx \|_{\alpha} = 1\), the denominator drops out. A numerical illustration of this unit-sphere reduction follows below.
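The restriction to the unit sphere suggests a crude numerical experiment: sample random unit-norm vectors and track the largest value of \(\| \bA \bx \|_{\beta}\). This only yields a lower-bound estimate, not an algorithm, but for \(\alpha = \beta = \ell_2\) we can compare it against the exact value (the largest singular value). A minimal NumPy sketch on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Crude lower bound on the (2 -> 2) operator norm: maximize ||A x||_2
# over randomly sampled unit-norm vectors x.
best = 0.0
for _ in range(10000):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    x /= np.linalg.norm(x)          # restrict to the unit sphere
    best = max(best, np.linalg.norm(A @ x))

exact = np.linalg.norm(A, 2)        # exact value: largest singular value
print(best, exact)                  # best <= exact, and typically close
```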

Lemma 4.90 (Operator norm is subordinate)

The \((\alpha \to \beta)\)-operator norm is subordinate to the vector norms \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\); i.e.,

\[ \| \bA \bx \|_{\beta} \leq \| \bA \|_{\alpha \to \beta } \| \bx \|_{\alpha}. \]

Proof. For \(\bx = \bzero\) the inequality is trivially satisfied. For \(\bx \neq \bzero\), by definition, we have

\[ \| \bA \|_{\alpha \to \beta } \geq \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} \implies \| \bA \|_{\alpha \to \beta } \| \bx \|_{\alpha} \geq \| \bA \bx \|_{\beta}. \]

Theorem 4.143 (Existence of the maximizer for operator norm)

There exists a vector \(\bx^* \in \CC^{n}\) with unit norm (\(\| \bx^* \|_{\alpha} = 1\)) such that

\[ \| \bA \|_{\alpha \to \beta} = \| \bA \bx^* \|_{\beta}. \]

Proof. Recall that

\[ \| \bA \|_{\alpha \to \beta} = \underset{\| \bx \|_{\alpha} = 1} {\max } \| \bA \bx \|_{\beta}. \]
  1. The norm function \(\| \cdot \|_{\beta}\) is continuous on \(\CC^m\).

  2. The mapping \(\bx \mapsto \bA \bx\) is continuous.

  3. Hence the function \(\bx \mapsto \| \bA \bx \|_{\beta}\) is continuous.

  4. The set \(\{ \bx \in \CC^n \ST \| \bx \|_{\alpha} = 1\}\) is compact.

  5. Hence, the function attains its maximum over this set.

We are now ready to prove the triangle inequality for the operator norm.

Lemma 4.91 (Triangle inequality for operator norm)

The operator norm as defined in Definition 4.145 satisfies the triangle inequality.

Proof. Let \(\bA\) and \(\bB\) be some matrices in \(\CC^{m \times n}\).

  1. Consider the operator norm of matrix \(\bA + \bB\).

  2. From previous remarks, there exists some vector \(\bx^* \in \CC^n\) with \(\| \bx^* \|_{\alpha} = 1\) such that

    \[ \| \bA + \bB \| = \| (\bA + \bB) \bx^* \|_{\beta}. \]
  3. Now

    \[ \| (\bA + \bB) \bx^* \|_{\beta} = \| \bA \bx^* + \bB \bx^* \|_{\beta} \leq \| \bA \bx^*\|_{\beta} + \| \bB \bx^*\|_{\beta}. \]
  4. From subordinate property of operator norm, we have

    \[ \| \bA \bx^*\|_{\beta} \leq \| \bA \| \|\bx^*\|_{\alpha} = \| \bA \| \]

    and

\[ \| \bB \bx^*\|_{\beta} \leq \| \bB \| \|\bx^*\|_{\alpha} = \| \bB \| \]

    since \(\| \bx^* \|_{\alpha} = 1\).

  5. Hence we have

    \[ \| \bA + \bB \| \leq \| \bA \| + \| \bB \|. \]

It turns out that the operator norm is also consistent under certain conditions.

Lemma 4.92 (Consistency of operator norm)

Let \(\| \cdot \|_{\alpha}\) be defined for every dimension \(m \in \Nat\), and let \(\| \cdot \|_{\beta} = \| \cdot \|_{\alpha}\). Then the operator norm

\[ \| \bA \|_{\alpha} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \]

is consistent.

Proof. We need to show that

\[ \| \bA \bB \|_{\alpha} \leq \| \bA \|_{\alpha} \| \bB \|_{\alpha}. \]
  1. Now

    \[ \| \bA \bB \|_{\alpha} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]
  2. We note that if \(\bB \bx = \bzero\), then \(\bA \bB \bx = \bzero\).

  3. Hence we can rewrite as

    \[ \| \bA \bB \|_{\alpha} = \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]
  4. Now if \(\bB \bx \neq \bzero\) then \( \| \bB \bx \|_{\alpha} \neq 0\).

  5. Hence

    \[ \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} = \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \]

    and

    \[ \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \leq \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]
  6. Clearly

    \[ \| \bB \|_{\alpha} = \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]
  7. Furthermore

    \[ \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \leq \underset{\by \neq \bzero}{\max } \frac{\| \bA \by \|_{\alpha}}{\| \by \|_{\alpha}} = \|\bA \|_{\alpha}. \]
  8. Thus we have

    \[ \| \bA \bB \|_{\alpha} \leq \| \bA \|_{\alpha} \| \bB \|_{\alpha}. \]

4.12.6. \(p\)-norm for Matrices#

We recall the definition of the \(\ell_p\) norms for vectors \(\bx \in \CC^n\):

\[\begin{split} \| \bx \|_p = \begin{cases} \left ( \sum_{i=1}^{n} | x_i |^p \right ) ^ {\frac{1}{p}} & p \in [1, \infty)\\ \underset{1 \leq i \leq n}{\max} |x_i| & p = \infty \end{cases}. \end{split}\]

The operator norms \(\| \cdot \|_p\) defined from \(\ell_p\) vector norms are of specific interest.

Definition 4.146 (Matrix \(p\)-norm)

The \(p\)-norm for a matrix \(\bA \in \CC^{m \times n}\) is defined as

\[ \| \bA \|_p \triangleq \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_p}{\| \bx \|_p} = \underset{\| \bx \|_p = 1}{\max } \| \bA \bx \|_p \]

where \(\| \bx \|_p\) is the standard \(\ell_p\) norm for vectors in \(\CC^m\) and \(\CC^n\).

As per Lemma 4.92, \(p\)-norms for matrices are consistent norms. They are also subordinate to the corresponding \(\ell_p\) vector norms.

The special cases \(p = 1, 2, \infty\) are considered below.

Theorem 4.144 (Max column sum, max row sum, spectral norms)

Let \(\bA \in \CC^{m \times n}\).

For \(p=1\) we have

\[ \| \bA \|_1 = \underset{1\leq j \leq n}{\max} \sum_{i=1}^m | a_{i j}|. \]

This is also known as max column sum norm.

For \(p=\infty\) we have

\[ \| \bA \|_{\infty} = \underset{1\leq i \leq m}{\max} \sum_{j=1}^n | a_{i j}|. \]

This is also known as max row sum norm.

Finally for \(p=2\) we have

\[ \| \bA \|_2 = \sigma_1 \]

where \(\sigma_1\) is the largest singular value of \(\bA\). This is also known as spectral norm.

Proof. Max column sum norm:

  1. Let

    \[ \bA = \begin{bmatrix} \ba^1 & \dots & \ba^n \end{bmatrix}. \]
  2. Then

    \[\begin{split} \begin{aligned} \| \bA \bx \|_1 &= \left \| \sum_{j=1}^n x_j \ba^j \right \|_1 \\ &\leq \sum_{j=1}^n \left \| x_j \ba^j \right \|_1 \\ &= \sum_{j=1}^n |x_j| \left \| \ba^j \right \|_1 \\ &\leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \sum_{j=1}^n |x_j| \\ &= \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \| \bx \|_1. \end{aligned} \end{split}\]
  3. Thus,

    \[ \| \bA \|_1 = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_1}{\| \bx \|_1} \leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \]

    which is the maximum column sum.

  4. We need to show that this upper bound is indeed an equality.

  5. Indeed, for \(\bx=\be_j\), where \(\be_j\) is the unit vector with \(1\) in the \(j\)-th entry and \(0\) elsewhere,

    \[ \| \bA \be_j \|_1 = \| \ba^j \|_1. \]
  6. Thus

    \[ \| \bA \|_1 \geq \| \ba^j \|_1 \quad \Forall 1 \leq j \leq n. \]
  7. Combining the two, we see that

    \[ \| \bA \|_1 = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1. \]

Max row sum norm:

  1. For \(p=\infty\), we proceed as follows.

    \[\begin{split} \begin{aligned} \| \bA \bx \|_{\infty} &= \underset{1 \leq i \leq m}{\max} \left | \sum_{j=1}^n a_{i j } x_j \right | \\ & \leq \underset{1 \leq i \leq m}{\max} \sum_{j=1}^n | a_{i j } | | x_j |\\ & \leq \underset{1 \leq j \leq n}{\max} | x_j | \underset{1 \leq i \leq m}{\max} \sum_{j=1}^n | a_{i j } |\\ &= \| \bx \|_{\infty} \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1 \end{aligned} \end{split}\]

    where \(\underline{\ba}^i\) are the rows of \(\bA\).

  2. This shows that

    \[ \| \bA \|_{\infty} \leq \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]
  3. We need to show that this is indeed an equality.

  4. Fix \(i = k\) and choose \(\bx\) so that \(a_{k j} x_j = | a_{k j} |\) for every \(j\); for complex entries this means

    \[\begin{split} x_j = \begin{cases} \overline{a_{k j}} / | a_{k j} | & a_{k j} \neq 0\\ 1 & a_{k j} = 0. \end{cases} \end{split}\]
  5. Clearly \(\| \bx \|_{\infty} = 1\).

  6. Then

    \[\begin{split} \begin{aligned} \| \bA \bx \|_{\infty} &= \underset{1 \leq i \leq m}{\max} \left | \sum_{j=1}^n a_{i j } x_j \right | \\ &\geq \left | \sum_{j=1}^n a_{k j } x_j \right | \\ &= \left | \sum_{j=1}^n | a_{k j } | \right | \\ &= \sum_{j=1}^n | a_{k j } |\\ &= \| \underline{\ba}^k \|_1. \end{aligned} \end{split}\]
  7. Thus,

    \[ \| \bA \|_{\infty} \geq \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]
  8. Combining the two inequalities we get:

    \[ \| \bA \|_{\infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]

Spectral norm:

  1. The remaining case is \(p=2\). Let \(\bA = \bU \Sigma \bV^H\) be a singular value decomposition of \(\bA\).

  2. For any vector \(\bx\) with \(\| \bx \|_2 = 1\),

    \[ \| \bA \bx \|_2 = \| \bU \Sigma \bV^H \bx \|_2 = \| \bU (\Sigma \bV^H \bx )\|_2 = \| \Sigma \bV^H \bx \|_2 \]

    since \(\ell_2\) norm is invariant to unitary transformations.

  3. Let \(\bv = \bV^H \bx\).

  4. Then \(\|\bv\|_2 = \| \bV^H \bx \|_2 = \| \bx \|_2 = 1\).

  5. Let \(k = \min(m, n)\).

  6. Now

    \[\begin{split} \| \bA \bx \|_2 &= \| \Sigma \bv \|_2\\ &= \left ( \sum_{j=1}^k | \sigma_j v_j |^2 \right )^{\frac{1}{2}}\\ &\leq \sigma_1 \left ( \sum_{j=1}^k | v_j |^2 \right )^{\frac{1}{2}}\\ &= \sigma_1 \| \bv \|_2 = \sigma_1. \end{split}\]
  7. This shows that

    \[ \| \bA \|_2 \leq \sigma_1. \]
  8. Now choose \(\bx = \bV \be_1\) so that \(\bv = \bV^H \bx = \be_1 = (1, 0, \dots, 0)\).

  9. Then

    \[ \| \bA \bx \|_2 = \| \Sigma \bv \|_2 = \sigma_1. \]
  10. Thus

    \[ \| \bA \|_2 \geq \sigma_1. \]
  11. Combining the two, we get that \(\| \bA \|_2 = \sigma_1\).
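All three formulas of Theorem 4.144 correspond to built-in matrix norms in NumPy, which makes them easy to check on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

# p = 1: maximum absolute column sum.
assert np.isclose(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max())

# p = inf: maximum absolute row sum.
assert np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max())

# p = 2: largest singular value.
assert np.isclose(np.linalg.norm(A, 2), np.linalg.svd(A, compute_uv=False)[0])
```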

4.12.7. The \(2\)-norm#

Theorem 4.145

Let \(\bA \in \CC^{n \times n}\) have singular values \(\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n\). Let the eigenvalues of \(\bA\) be \(\lambda_1, \lambda_2, \dots, \lambda_n\) with \(|\lambda_1| \geq |\lambda_2| \geq \dots \geq |\lambda_n|\). Then the following hold:

\[ \| \bA \|_2 = \sigma_1 \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{\sigma_n}. \]

If \(\bA\) is Hermitian and positive definite, then

\[ \| \bA \|_2 = \lambda_1 \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{\lambda_n}. \]

If \(\bA\) is normal then

\[ \| \bA \|_2 = |\lambda_1| \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{|\lambda_n|}. \]
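A numerical check of the first two claims of Theorem 4.145 (a random complex matrix is almost surely nonsingular, so inverting it here is safe):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

sigma = np.linalg.svd(A, compute_uv=False)   # sigma_1 >= ... >= sigma_n
assert np.isclose(np.linalg.norm(A, 2), sigma[0])
assert np.isclose(np.linalg.norm(np.linalg.inv(A), 2), 1 / sigma[-1])
```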

4.12.8. Unitary Invariant Norms#

Definition 4.147 (Unitary invariant norm)

A matrix norm \(\| \cdot \|\) on \(\CC^{m \times n}\) is called unitary invariant if \(\| \bU \bA \bV \| = \| \bA \|\) for any \(\bA \in \CC^{m \times n}\) and any unitary matrices \(\bU \in \CC^{m \times m}\) and \(\bV \in \CC^{n \times n}\).

We have already seen in Theorem 4.141 that the Frobenius norm is unitary invariant. It turns out that the spectral norm is also unitary invariant.

4.12.9. More Properties of Operator Norms#

In this section we will focus on operator norms connecting normed linear spaces \((\CC^n, \| \cdot \|_{p})\) and \((\CC^m, \| \cdot \|_{q})\). Typical values of \(p, q\) would be in \(\{1, 2, \infty\}\).

We recall that

\[ \| \bA \|_{p \to q } = \underset{\bx \neq \bzero}{\max} \frac{\| \bA \bx \|_q}{\| \bx \|_p} = \underset{ \| \bx \|_p = 1}{\max} \| \bA \bx \|_q = \underset{\| \bx \|_p \leq 1}{\max} \| \bA \bx \|_q. \]

The table below [78] shows how to compute various \((p \to q)\) norms. Some can be computed easily while others are NP-hard to compute.

Table 4.1 Typical \((p \to q)\) norms#

| \(p\) | \(q\) | \(\lVert \bA \rVert_{p \to q}\) | Calculation |
| --- | --- | --- | --- |
| \(1\) | \(1\) | \(\lVert \bA \rVert_{1}\) | Maximum \(\ell_1\) norm of a column |
| \(1\) | \(2\) | \(\lVert \bA \rVert_{1 \to 2}\) | Maximum \(\ell_2\) norm of a column |
| \(1\) | \(\infty\) | \(\lVert \bA \rVert_{1 \to \infty}\) | Maximum absolute entry of the matrix |
| \(2\) | \(1\) | \(\lVert \bA \rVert_{2 \to 1}\) | NP-hard |
| \(2\) | \(2\) | \(\lVert \bA \rVert_{2}\) | Maximum singular value |
| \(2\) | \(\infty\) | \(\lVert \bA \rVert_{2 \to \infty}\) | Maximum \(\ell_2\) norm of a row |
| \(\infty\) | \(1\) | \(\lVert \bA \rVert_{\infty \to 1}\) | NP-hard |
| \(\infty\) | \(2\) | \(\lVert \bA \rVert_{\infty \to 2}\) | NP-hard |
| \(\infty\) | \(\infty\) | \(\lVert \bA \rVert_{\infty}\) | Maximum \(\ell_1\) norm of a row |
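The tractable entries of Table 4.1 reduce to simple row and column reductions; a sketch computing three of them (the NP-hard entries admit no such shortcut):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

col_l2 = np.linalg.norm(A, axis=0)    # ell_2 norm of each column
row_l2 = np.linalg.norm(A, axis=1)    # ell_2 norm of each row

norm_1_to_2 = col_l2.max()            # ||A||_{1 -> 2}: max column ell_2 norm
norm_2_to_inf = row_l2.max()          # ||A||_{2 -> inf}: max row ell_2 norm
norm_1_to_inf = np.abs(A).max()       # ||A||_{1 -> inf}: max absolute entry

print(norm_1_to_2, norm_2_to_inf, norm_1_to_inf)
```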

The topological dual of the finite dimensional normed linear space \((\CC^n, \| \cdot \|_{p})\) is the normed linear space \((\CC^n, \| \cdot \|_{p'})\) where

\[ \frac{1}{p} + \frac{1}{p'} = 1. \]
  • The \(\ell_2\)-norm is its own dual; it is self-dual.

  • The \(\ell_1\)-norm and the \(\ell_{\infty}\)-norm are duals of each other.

When a matrix \(\bA\) maps from the space \((\CC^n, \| \cdot \|_{p})\) to the space \((\CC^m, \| \cdot \|_{q})\), we can view its conjugate transpose \(\bA^H\) as a mapping from the space \((\CC^m, \| \cdot \|_{q'})\) to \((\CC^n, \| \cdot \|_{p'})\).

Theorem 4.146 (Operator norm and conjugate transpose)

The operator norm of a matrix always equals the corresponding operator norm of its conjugate transpose. In other words,

\[ \| \bA \|_{p \to q} = \| \bA^H \|_{q' \to p'} \]

where

\[ \frac{1}{p} + \frac{1}{p'} = 1, \frac{1}{q} + \frac{1}{q'} = 1. \]

Specific applications of this result are:

\[ \| \bA \|_2 = \| \bA^H \|_2. \]

This is obvious since the maximum singular values of a matrix and of its conjugate transpose are the same.

\[ \| \bA \|_1 = \| \bA^H \|_{\infty}, \quad \| \bA \|_{\infty} = \| \bA^H \|_1. \]

This is also obvious since the max column sum of \(\bA\) is the same as the max row sum of \(\bA^H\) and vice versa.

\[ \| \bA \|_{1 \to \infty} = \| \bA^H \|_{1 \to \infty}. \]
\[ \| \bA \|_{1 \to 2} = \| \bA^H \|_{2 \to \infty}. \]
\[ \| \bA \|_{\infty \to 2} = \| \bA^H \|_{2 \to 1}. \]

It remains to show the result for the general case (arbitrary \(1 \leq p, q \leq \infty\)).

Proof. The key ingredient is the dual characterization of a vector norm: \(\| \bz \|_q = \underset{\| \by \|_{q'} = 1}{\max} | \by^H \bz |\). Applying it twice,

\[ \| \bA \|_{p \to q} = \underset{\| \bx \|_p = 1}{\max} \; \underset{\| \by \|_{q'} = 1}{\max} | \by^H \bA \bx | = \underset{\| \by \|_{q'} = 1}{\max} \; \underset{\| \bx \|_p = 1}{\max} | (\bA^H \by)^H \bx | = \underset{\| \by \|_{q'} = 1}{\max} \| \bA^H \by \|_{p'} = \| \bA^H \|_{q' \to p'}. \]

Theorem 4.147 (\(1 \to p\) norm)

\[ \| \bA \|_{1 \to p} = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p. \]

where

\[ \bA = \begin{bmatrix} \ba^1 & \dots & \ba^n \end{bmatrix}. \]

Proof. Expanding:

\[\begin{split} \| \bA \bx \|_p &= \left \| \sum_{j=1}^n x_j \ba^j \right \|_p \\ &\leq \sum_{j=1}^n \left \| x_j \ba^j \right \|_p \\ &= \sum_{j=1}^n |x_j| \left \| \ba^j \right \|_p \\ &\leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p \sum_{j=1}^n |x_j| \\ &= \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p \| \bx \|_1. \end{split}\]

Thus,

\[ \| \bA \|_{1 \to p} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_p}{\| \bx \|_1} \leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p. \]

We need to show that this upper bound is indeed an equality.

Indeed for any \(\bx=\be_j\) where \(\be_j\) is a unit vector with \(1\) in \(j\)-th entry and 0 elsewhere,

\[ \| \bA \be_j \|_p = \| \ba^j \|_p. \]

Thus

\[ \| \bA \|_{1 \to p} \geq \| \ba^j \|_p \quad \Forall 1 \leq j \leq n. \]

Combining the two, we see that

\[ \| \bA \|_{1 \to p} = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p. \]

Theorem 4.148 (\(p \to \infty\) norm)

\[ \| \bA \|_{p \to \infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_q \]

where

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Proof. Using Theorem 4.146, we get

\[ \| \bA \|_{p \to \infty} = \| \bA^H \|_{1 \to q}. \]

Using Theorem 4.147, we get

\[ \| \bA^H \|_{1 \to q} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_q. \]

This completes the proof.

Theorem 4.149 (Consistency of \(p \to q\) norms)

For two matrices \(\bA\) and \(\bB\) with \(p,q,s \geq 1\), we have

\[ \| \bA \bB \|_{p \to q} \leq \| \bB \|_{p \to s} \| \bA \|_{s \to q}. \]

Proof. We start with

\[ \| \bA \bB \|_{p \to q} = \underset{\| \bx \|_p = 1}{\max} \| \bA ( \bB \bx) \|_q. \]

From Lemma 4.90, we obtain

\[ \| \bA ( \bB \bx) \|_q \leq \| \bA \|_{s \to q} \| ( \bB \bx) \|_s. \]

Thus,

\[ \| \bA \bB \|_{p \to q} \leq \| \bA \|_{s \to q} \underset{\| \bx \|_p = 1}{\max} \| ( \bB \bx) \|_s = \| \bA \|_{s \to q} \| \bB \|_{p \to s}. \]

Theorem 4.150 (Consistency of \(p \to \infty\) norms)

For two matrices \(\bA\) and \(\bB\) and \(p \geq 1\), we have

\[ \| \bA \bB \|_{p \to \infty} \leq \| \bA \|_{\infty} \| \bB \|_{p \to \infty}. \]

Proof. We start with

\[ \| \bA \bB \|_{p \to \infty} = \underset{\| \bx \|_p = 1}{\max} \| \bA ( \bB \bx) \|_{\infty}. \]

From Lemma 4.90, we obtain

\[ \| \bA ( \bB \bx) \|_{\infty} \leq \| \bA \|_{\infty \to \infty} \| ( \bB \bx) \|_{\infty}. \]

Thus,

\[ \| \bA \bB \|_{p \to \infty} \leq \| \bA \|_{\infty \to \infty} \underset{\| \bx \|_p = 1}{\max} \| ( \bB \bx) \|_{\infty} = \| \bA \|_{\infty} \| \bB \|_{p \to \infty}. \]

Theorem 4.151 (Dominance of \(p\)-norm on \(p \to \infty\) norm)

\[ \| \bA \|_{p \to \infty} \leq \| \bA \|_{p}. \]

In particular

\[ \| \bA \|_{1 \to \infty} \leq \| \bA \|_{1}. \]

Also:

\[ \| \bA \|_{2 \to \infty} \leq \| \bA \|_{2}. \]

Proof. Choosing \(q = \infty\) and \(s = p\) in Theorem 4.149 (with \(\bI\) playing the role of \(\bA\) and \(\bA\) the role of \(\bB\)) gives

\[ \| \bI \bA \|_{p \to \infty} \leq \| \bA \|_{p \to p} \| \bI \|_{p \to \infty}. \]

By Theorem 4.148, \(\| \bI \|_{p \to \infty}\) is the maximum \(\ell_q\) norm (with \(1/p + 1/q = 1\)) of a row of \(\bI\), which is \(1\). Thus

\[ \| \bA \|_{p \to \infty} \leq \| \bA \|_{p \to p}. \]
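A quick check of the special case \(p = 2\): by Table 4.1, \(\| \bA \|_{2 \to \infty}\) is the maximum \(\ell_2\) norm of a row, and it never exceeds the spectral norm:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

# ||A||_{2 -> inf} is the maximum ell_2 norm of a row.
norm_2_to_inf = np.linalg.norm(A, axis=1).max()

# Theorem 4.151 with p = 2: dominated by the spectral norm.
assert norm_2_to_inf <= np.linalg.norm(A, 2) + 1e-12
```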

Consider the expression

\[\begin{split} \underset{ \substack{\bz \in \ColSpace(\bA^H) \\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_{q}}{\| \bz \|_p}. \end{split}\]

The constraint \(\bz \in \ColSpace(\bA^H), \bz \neq \bzero\) means there exists some vector \(\bu \notin \Kernel(\bA^H)\) such that \(\bz = \bA^H \bu\).

This expression measures the smallest factor by which \(\bA\) scales the length of a vector drawn from its row space, i.e., the action of the nonsingular part of \(\bA\).

Theorem 4.152

The following bound holds for every matrix \(\bA\):

\[\begin{split} \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_{q}}{\| \bz \|_p} \geq \| \bA^{\dag}\|_{q \to p}^{-1}. \end{split}\]

If \(\bA\) is surjective (onto), then the equality holds. When \(\bA\) is bijective (one-one onto, square, invertible), then the result implies

\[\begin{split} \underset{\substack{\bz \in \ColSpace(\bA^H) \\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_{q}}{\| \bz \|_p} = \| \bA^{-1}\|_{q \to p}^{-1}. \end{split}\]

Proof. The spaces \(\ColSpace(\bA^H)\) and \(\ColSpace(\bA)\) have the same dimension, namely \(\Rank(\bA)\).

  1. We recall that \(\bA^{\dag} \bA\) is a projector onto the column space of \(\bA^H\).

    \[ \bw = \bA \bz \iff \bz = \bA^{\dag} \bw = \bA^{\dag} \bA \bz \Forall \bz \in \ColSpace (\bA^H). \]
  2. As a result we can write

    \[ \frac{\| \bz \|_p}{ \| \bA \bz \|_q} = \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} \]

    whenever \(\bz \in \ColSpace(\bA^H)\) and \(\bz \neq \bzero\).

  3. Now

    \[\begin{split} \left [ \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_q}{\| \bz \|_p}\right ]^{-1} = \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\max} \frac{\| \bz \|_p}{ \| \bA \bz \|_q} = \underset{\substack{\bw \in \ColSpace(\bA) \\ \bw \neq \bzero}}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} \leq \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q}. \end{split}\]
  4. When \(\bA\) is surjective, then \(\ColSpace(\bA) = \CC^m\).

  5. Hence

    \[\begin{split} \underset{\substack{\bw \in \ColSpace(\bA)\\ \bw \neq \bzero}}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} = \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q}. \end{split}\]
  6. Thus, the inequality changes into equality.

  7. Finally

    \[ \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} = \| \bA^{\dag} \|_{q \to p} \]

    which completes the proof.

4.12.10. Row Column Norms#

A common way of measuring the norm of a matrix is to compute the \(\ell_p\) norm of each row and then compute the \(\ell_q\) norm of the resulting column vector. Such norms are known as row-column norms.

  1. If \(p=q=2\), then the resulting norm is the same as the Frobenius norm.

  2. If \(p=q=1\), then the resulting norm is the same as the sum norm.

  3. If \(p=q=\infty\), then the resulting norm is the same as the max norm.

Thus, one can think of the row column norms as a generalization of these norms.

Definition 4.148 (Row-column norms)

Let \(\bA\) be an \(m\times n\) matrix with rows \(\underline{\ba}^i\) as

\[\begin{split} \bA = \begin{bmatrix} \underline{\ba}^1\\ \vdots \\ \underline{\ba}^m \end{bmatrix} \end{split}\]

Then we define

\[ \| \bA \|_{p, \infty} \triangleq \underset{1 \leq i \leq m}{\max} \| \underline{\ba}^i \|_p = \underset{1 \leq i \leq m}{\max} \left ( \sum_{j=1}^n | a_{i j} |^p \right )^{\frac{1}{p}} \]

where \(1 \leq p < \infty\). In other words, we take the \(\ell_p\)-norm of each row vector and then find the maximum.

We define

\[ \| \bA \|_{\infty, \infty} = \underset{i, j}{\max} |a_{i j}|. \]

This is equivalent to taking the \(\ell_{\infty}\) norm of each row and then taking the maximum over all rows.

For \(1 \leq p , q < \infty\), we define the norm

\[ \| \bA \|_{p, q} \triangleq \left [ \sum_{i=1}^m \left ( \| \underline{\ba}^i \|_p \right )^q \right ]^{\frac{1}{q}}. \]

In other words, we compute the \(\ell_p\)-norm of each row vector to form another vector and then take the \(\ell_q\)-norm of that vector.

Note that the norm \(\| \bA \|_{p, \infty}\) is different from the operator norm \(\| \bA \|_{p \to \infty}\). Similarly \(\| \bA \|_{p, q}\) is different from \(\| \bA \|_{p \to q}\).
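A small sketch of the row-column norm; the helper `rc_norm` is our own illustrative function, not a library routine. It confirms the three special cases listed above:

```python
import numpy as np

def rc_norm(A, p, q):
    """Row-column norm ||A||_{p,q}: take the ell_p norm of each row,
    then the ell_q norm of the resulting vector (np.inf gives the max)."""
    row_norms = np.linalg.norm(A, ord=p, axis=1)
    return np.linalg.norm(row_norms, ord=q)

rng = np.random.default_rng(10)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

assert np.isclose(rc_norm(A, 2, 2), np.linalg.norm(A, 'fro'))   # Frobenius norm
assert np.isclose(rc_norm(A, 1, 1), np.abs(A).sum())            # sum norm
assert np.isclose(rc_norm(A, np.inf, np.inf), np.abs(A).max())  # max norm
```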

Theorem 4.153

\[ \| \bA \|_{p, \infty} = \| \bA \|_{q \to \infty} \]

where

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Proof. From Theorem 4.148 we get

\[ \| \bA \|_{q \to \infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_p. \]

This is exactly the definition of \(\| \bA \|_{p, \infty}\).

Theorem 4.154

\[ \| \bA \|_{1 \to p} = \| \bA^H \|_{p, \infty}. \]

Proof. By Theorem 4.146,

\[ \| \bA \|_{1 \to p} = \| \bA^H \|_{q \to \infty} \]

where \(\frac{1}{p} + \frac{1}{q} = 1\).

From Theorem 4.153

\[ \| \bA^H \|_{q \to \infty} = \| \bA^H \|_{p, \infty}. \]

Theorem 4.155

For any two matrices \(\bA, \bB\) for which the product \(\bA \bB\) is defined, we have

\[ \frac{\| \bA \bB \|_{p, \infty}}{\| \bB \|_{p, \infty}} \leq \| \bA \|_{\infty \to \infty}. \]

Proof. Let \(q\) be such that \(\frac{1}{p} + \frac{1}{q} = 1\).

  1. From Theorem 4.150, we have

    \[ \| \bA \bB \|_{q \to \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{q \to \infty}. \]
  2. From Theorem 4.153

    \[ \| \bA \bB \|_{q \to \infty} = \| \bA \bB\|_{p, \infty} \]

    and

    \[ \| \bB \|_{q \to \infty} = \| \bB \|_{p, \infty}. \]
  3. Thus

    \[ \| \bA \bB\|_{p, \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p, \infty}. \]

Theorem 4.156 (Relations between \((p,q)\) and \((p \to q)\) norms)


\[ \| \bA \|_{1, \infty} = \| \bA \|_{\infty \to \infty}. \]
\[ \| \bA \|_{2, \infty} = \| \bA \|_{2 \to \infty}. \]
\[ \| \bA \|_{\infty, \infty} = \| \bA \|_{1 \to \infty}. \]
\[ \| \bA \|_{1 \to 1} = \| \bA^H \|_{1, \infty}. \]
\[ \| \bA \|_{1 \to 2} = \| \bA^H \|_{2, \infty}. \]

Proof. The first three are straightforward applications of Theorem 4.153. The last two are applications of Theorem 4.154. See also Table 4.1.

4.12.11. Block Diagonally Dominant Matrices and Generalized Gershgorin Circle Theorem#

In [35], the idea of diagonally dominant matrices (see Diagonally Dominant Matrices) is generalized to block matrices using matrix norms. We consider the specific case with the spectral norm.

Definition 4.149 (Block diagonally dominant matrix)

Let \(\bA\) be a square matrix in \(\CC^{n \times n}\) which is partitioned in the following manner

\[\begin{split} \bA = \begin{bmatrix} \bA_{11} & \bA_{12} & \dots & \bA_{1 k}\\ \bA_{21} & \bA_{22} & \dots & \bA_{2 k}\\ \vdots & \vdots & \ddots & \vdots\\ \bA_{k 1} & \bA_{k 2} & \dots & \bA_{k k}\\ \end{bmatrix} \end{split}\]

where each of the submatrices \(\bA_{i j}\) is a square matrix of size \(m \times m\). Thus \(n = k m \).

\(\bA\) is called block diagonally dominant if

\[ \| \bA_{ii}\|_2 \geq \sum_{j, j \neq i } \|\bA_{i j} \|_2 \]

holds for all \(1 \leq i \leq k\).

If the inequality holds strictly for all \(i\), then \(\bA\) is called a block strictly diagonally dominant matrix.
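A sketch of a block diagonal dominance test in the spectral norm; `is_block_diag_dominant` is an illustrative helper written for Definition 4.149, not a library routine:

```python
import numpy as np

def is_block_diag_dominant(A, m, strict=False):
    """Check block (strict) diagonal dominance of A in C^{km x km},
    partitioned into m x m blocks, using the spectral norm."""
    k = A.shape[0] // m
    blk = lambda i, j: A[i*m:(i+1)*m, j*m:(j+1)*m]
    for i in range(k):
        diag = np.linalg.norm(blk(i, i), 2)
        off = sum(np.linalg.norm(blk(i, j), 2) for j in range(k) if j != i)
        if diag < off or (strict and diag <= off):
            return False
    return True

# A block diagonal matrix is trivially block strictly diagonally dominant.
A = np.kron(np.eye(3), np.array([[2.0, 1.0], [1.0, 2.0]]))
assert is_block_diag_dominant(A, 2, strict=True)
```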

Theorem 4.157 (Nonsingularity of block strictly diagonally dominant matrices)

If the partitioned matrix \(\bA\) of Definition 4.149 is a block strictly diagonally dominant matrix, then it is nonsingular.

For proof see [35].

This leads to the generalized Gershgorin disc theorem.

Theorem 4.158 (Generalized Gershgorin disc theorem)

Let \(\bA\) be a square matrix in \(\CC^{n \times n}\) which is partitioned in the following manner

\[\begin{split} \bA = \begin{bmatrix} \bA_{11} & \bA_{12} & \dots & \bA_{1 k}\\ \bA_{21} & \bA_{22} & \dots & \bA_{2 k}\\ \vdots & \vdots & \ddots & \vdots\\ \bA_{k 1} & \bA_{k 2} & \dots & \bA_{k k}\\ \end{bmatrix} \end{split}\]

where each of the submatrices \(\bA_{i j}\) is a square matrix of size \(m \times m\).

Then each eigenvalue \(\lambda\) of \(\bA\) satisfies

\[ \| \lambda \bI - \bA_{ii}\|_2 \leq \sum_{j, j\neq i} \|\bA_{ij} \|_2 \text{ for some } i \in \{1,2, \dots, k\}. \]

For proof see [35].

Since the \(2\)-norm of a Hermitian positive semidefinite matrix is its largest eigenvalue, the theorem directly leads to the following result.

Corollary 4.29 (\(2\) norm of a Hermitian positive semidefinite matrix)

Let \(\bA\) be a Hermitian positive semidefinite matrix. Let \(\bA\) be partitioned as in Theorem 4.158. Then its \(2\)-norm \(\| \bA \|_2\) satisfies

\[ \left | \| \bA \|_2 - \| \bA_{ii}\|_2 \right | \leq \sum_{j, j\neq i} \|\bA_{i j} \|_2 \text{ for some } i \in \{1,2, \dots, k \}. \]
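Corollary 4.29 asserts the bound only for some index \(i\), so a numerical check verifies that at least one block index satisfies it. A sketch on a random Hermitian positive semidefinite matrix (the block size \(m\) and block count \(k\) are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(11)
m, k = 2, 3
B = rng.standard_normal((m * k, m * k)) + 1j * rng.standard_normal((m * k, m * k))
A = B.conj().T @ B                  # Hermitian positive semidefinite

blk = lambda i, j: A[i*m:(i+1)*m, j*m:(j+1)*m]
norm2 = lambda M: np.linalg.norm(M, 2)

lhs = [abs(norm2(A) - norm2(blk(i, i))) for i in range(k)]
rhs = [sum(norm2(blk(i, j)) for j in range(k) if j != i) for i in range(k)]

# Corollary 4.29: the bound holds for at least one block index i.
assert any(l <= r + 1e-12 for l, r in zip(lhs, rhs))
```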