4.12. Matrix Norms#
This section reviews various matrix norms on the vector space of complex matrices over the field of complex numbers \((\CC^{m \times n}, \CC)\).
We know \((\CC^{m \times n}, \CC)\) is a finite dimensional vector space with dimension \(m n\). We will usually refer to it as \(\CC^{m \times n}\).
Matrix norms will follow the usual definition of norms for a vector space.
(Matrix norm)
A function \(\| \cdot \| : \CC^{m \times n} \to \RR\) is called a matrix norm on \(\CC^{m \times n}\) if for all \(\bA, \bB \in \CC^{m \times n}\) and all \(\alpha \in \CC\) it satisfies the following
[Positivity]
\[ \| \bA \| \geq 0 \]with \(\| \bA \| = 0 \iff \bA = \ZERO\).
[Homogeneity]
\[ \| \alpha \bA \| = | \alpha | \| \bA \|. \]
[Triangle inequality]
\[ \| \bA + \bB \| \leq \| \bA \| + \| \bB \|. \]
We recall some of the standard results on finite dimensional normed vector spaces.
All matrix norms are equivalent. Let \(\| \cdot \|\) and \(\| \cdot \|'\) be two different matrix norms on \(\CC^{m \times n}\). Then there exist two positive constants \(a\) and \(b\) such that the following holds:

\[ a \| \bA \| \leq \| \bA \|' \leq b \| \bA \| \Forall \bA \in \CC^{m \times n}. \]
A matrix norm is a continuous function \(\| \cdot \| : \CC^{m \times n} \to \RR\).
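To make the equivalence concrete, here is a minimal numpy sketch (the random matrix, seed, and tolerance are arbitrary illustrative choices) using the well-known constants relating the spectral and Frobenius norms, namely \(\| \bA \|_2 \leq \| \bA \|_F \leq \sqrt{r} \| \bA \|_2\) with \(r = \Rank(\bA)\):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))

spectral = np.linalg.norm(A, 2)        # largest singular value
frobenius = np.linalg.norm(A, 'fro')   # entrywise l2 norm
r = np.linalg.matrix_rank(A)

# the equivalence constants a = 1 and b = sqrt(r) for this pair of norms
assert spectral <= frobenius <= np.sqrt(r) * spectral + 1e-12
```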
4.12.1. Norms like \(\ell_p\) on Complex Vector Space#
The following norms are analogous to \(\ell_p\) norms on the finite dimensional complex vector space \(\CC^n\). They arise from the fact that the matrix vector space \(\CC^{m\times n}\) is in one-to-one correspondence with the complex vector space \(\CC^{m n}\).
(Sum norm)
Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).
The matrix sum norm is defined as

\[ \| \bA \|_S = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |. \]
(Frobenius norm)
Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).
The matrix Frobenius norm is defined as

\[ \| \bA \|_F = \left ( \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 \right )^{\frac{1}{2}}. \]
(Max norm)
Let \(\bA \in \CC^{m\times n}\) and \(\bA = [a_{i j}]\).
The matrix max norm is defined as

\[ \| \bA \|_M = \underset{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}}{\max} | a_{i j} |. \]
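All three norms above are straightforward to compute numerically. A minimal numpy sketch (the example matrix is an arbitrary illustration):

```python
import numpy as np

# a small complex matrix to illustrate the entrywise norms defined above
A = np.array([[1 + 2j, 0 - 1j, 3 + 0j],
              [2 - 1j, 1 + 1j, 0 + 2j]])

sum_norm = np.sum(np.abs(A))                 # sum of absolute values of entries
frobenius = np.sqrt(np.sum(np.abs(A)**2))    # sqrt of sum of squared magnitudes
max_norm = np.max(np.abs(A))                 # largest absolute entry

# numpy's built-in Frobenius norm agrees with the entrywise formula
assert np.isclose(frobenius, np.linalg.norm(A, 'fro'))
print(sum_norm, frobenius, max_norm)
```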
4.12.2. Properties of Frobenius Norm#
We now prove some elementary properties of the Frobenius norm.
The Frobenius norm of a matrix is equal to the Frobenius norm of its Hermitian transpose.
Proof. This is valid since the norm involves only the absolute values of entries in the matrix.
Let
\[ \bA = [a_{i j}]. \]Then
\[ \bA^H = [\overline{a_{j i}}] \]Then
\[ \| \bA^H \|_F^2 = \sum_{j=1}^n \sum_{i=1}^{m} | \overline{a_{i j}} |^2 = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 = \| \bA \|_F^2. \]Now
\[ \| \bA^H \|_F^2 = \| \bA \|_F^2 \implies \| \bA^H \|_F = \| \bA \|_F. \]
(Expansion in squared norms of columns)
Let \(\bA \in \CC^{m \times n}\) be written as a row of column vectors

\[ \bA = \begin{bmatrix} \ba_1 & \dots & \ba_n \end{bmatrix}. \]

Then

\[ \| \bA \|_F^2 = \sum_{j=1}^{n} \| \ba_j \|_2^2. \]
Proof. We note that

\[ \| \ba_j \|_2^2 = \sum_{i=1}^{m} | a_{i j} |^2. \]

Now

\[ \| \bA \|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 = \sum_{j=1}^n \left ( \sum_{i=1}^{m} | a_{i j} |^2 \right ) = \sum_{j=1}^n \| \ba_j \|_2^2. \]
We have thus shown that the square of the Frobenius norm of a matrix is simply the sum of squares of the \(\ell_2\) norms of its columns.
Let \(\bA \in \CC^{m \times n}\) be written as a column of row vectors

\[\begin{split} \bA = \begin{bmatrix} \underline{\ba}^1 \\ \vdots \\ \underline{\ba}^m \end{bmatrix}. \end{split}\]

Then

\[ \| \bA \|_F^2 = \sum_{i=1}^{m} \| \underline{\ba}^i \|_2^2. \]

Proof. We note that

\[ \| \underline{\ba}^i \|_2^2 = \sum_{j=1}^{n} | a_{i j} |^2. \]

Now

\[ \| \bA \|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^n | a_{i j} |^2 = \sum_{i=1}^{m} \left ( \sum_{j=1}^n | a_{i j} |^2 \right ) = \sum_{i=1}^{m} \| \underline{\ba}^i \|_2^2. \]
We now consider how the Frobenius norm is affected with the action of unitary matrices.
Let \(\bA\) be an arbitrary matrix in \(\CC^{m \times n}\).
Let \(\bU\) be a unitary matrix in \(\CC^{m \times m}\).
Let \(\bV\) be a unitary matrix in \(\CC^{n \times n}\).
We present our first result: multiplication with unitary matrices does not change the Frobenius norm of a matrix.
The Frobenius norm of a matrix is invariant to pre or post multiplication by a unitary matrix, i.e.

\[ \| \bU \bA \|_F = \| \bA \|_F \]

and

\[ \| \bA \bV \|_F = \| \bA \|_F. \]
Proof. We can write \(\bA\) as

\[ \bA = \begin{bmatrix} \ba_1 & \dots & \ba_n \end{bmatrix}. \]

So

\[ \bU \bA = \begin{bmatrix} \bU \ba_1 & \dots & \bU \ba_n \end{bmatrix}. \]

Then applying Lemma 4.87 clearly

\[ \| \bU \bA \|_F^2 = \sum_{j=1}^{n} \| \bU \ba_j \|_2^2. \]

But we know that unitary matrices are norm preserving. Hence

\[ \| \bU \ba_j \|_2 = \| \ba_j \|_2. \]

Thus

\[ \| \bU \bA \|_F^2 = \sum_{j=1}^{n} \| \ba_j \|_2^2 = \| \bA \|_F^2 \]

which implies

\[ \| \bU \bA \|_F = \| \bA \|_F. \]
Similarly writing \(\bA\) as

\[\begin{split} \bA = \begin{bmatrix} \underline{\ba}^1 \\ \vdots \\ \underline{\ba}^m \end{bmatrix} \end{split}\]

we have

\[\begin{split} \bA \bV = \begin{bmatrix} \underline{\ba}^1 \bV \\ \vdots \\ \underline{\ba}^m \bV \end{bmatrix}. \end{split}\]

Then applying Lemma 4.88 clearly

\[ \| \bA \bV \|_F^2 = \sum_{i=1}^{m} \| \underline{\ba}^i \bV \|_2^2. \]

But we know that unitary matrices are norm preserving. Hence

\[ \| \underline{\ba}^i \bV \|_2 = \| \underline{\ba}^i \|_2. \]

Thus

\[ \| \bA \bV \|_F^2 = \sum_{i=1}^{m} \| \underline{\ba}^i \|_2^2 = \| \bA \|_F^2 \]

which implies

\[ \| \bA \bV \|_F = \| \bA \|_F. \]
An alternative approach for the second part of the proof, using the first part, takes just one line:

\[ \| \bA \bV \|_F = \| (\bA \bV)^H \|_F = \| \bV^H \bA^H \|_F = \| \bA^H \|_F = \| \bA \|_F. \]

In the above we use Lemma 4.86 and the fact that \(\bV\) being a unitary matrix implies that \(\bV^H\) is also a unitary matrix. We have already shown that pre multiplication by a unitary matrix preserves the Frobenius norm.
(Consistency of Frobenius norm)
Let \(\bA \in \CC^{m \times n}\) and \(\bB \in \CC^{n \times p}\) be two matrices. Then the Frobenius norm of their product is less than or equal to the product of the Frobenius norms of the matrices themselves, i.e.

\[ \| \bA \bB \|_F \leq \| \bA \|_F \| \bB \|_F. \]
Proof. We can write \(\bA\) as

\[\begin{split} \bA = \begin{bmatrix} \ba_1^T \\ \vdots \\ \ba_m^T \end{bmatrix} \end{split}\]

where \(\ba_i\) are \(m\) column vectors corresponding to the rows of \(\bA\). Similarly we can write \(\bB\) as

\[ \bB = \begin{bmatrix} \bb_1 & \dots & \bb_p \end{bmatrix} \]

where \(\bb_j\) are column vectors corresponding to the columns of \(\bB\).

Then

\[ \bA \bB = \left [ \ba_i^T \bb_j \right ] \]

i.e. the \((i, j)\)-th entry of the product is \(\ba_i^T \bb_j\). Now looking carefully

\[ \| \bA \bB \|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^p | \ba_i^T \bb_j |^2. \]

Applying the Cauchy-Schwarz inequality we have

\[ | \ba_i^T \bb_j |^2 \leq \| \ba_i \|_2^2 \| \bb_j \|_2^2. \]

Now

\[ \| \bA \bB \|_F^2 \leq \sum_{i=1}^{m} \sum_{j=1}^p \| \ba_i \|_2^2 \| \bb_j \|_2^2 = \left ( \sum_{i=1}^{m} \| \ba_i \|_2^2 \right ) \left ( \sum_{j=1}^p \| \bb_j \|_2^2 \right ) = \| \bA \|_F^2 \| \bB \|_F^2 \]

which implies

\[ \| \bA \bB \|_F \leq \| \bA \|_F \| \bB \|_F \]

by taking square roots on both sides.
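A quick numerical sanity check of this sub-multiplicativity (a sketch; the random matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

# Frobenius norm is consistent: ||A B||_F <= ||A||_F ||B||_F
lhs = np.linalg.norm(A @ B, 'fro')
rhs = np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')
assert lhs <= rhs + 1e-12
```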
Let \(\bA \in \CC^{m \times n}\) and let \(\bx \in \CC^n\). Then

\[ \| \bA \bx \|_2 \leq \| \bA \|_F \| \bx \|_2. \]

Proof. We note that the Frobenius norm of a column matrix is the same as the \(\ell_2\) norm of the corresponding column vector, i.e.

\[ \| \bx \|_F = \| \bx \|_2 \]

when \(\bx\) is treated as an \(n \times 1\) matrix. Now applying Theorem 4.142 we have

\[ \| \bA \bx \|_2 = \| \bA \bx \|_F \leq \| \bA \|_F \| \bx \|_F = \| \bA \|_F \| \bx \|_2. \]
It turns out that Frobenius norm is intimately related to the singular value decomposition of a matrix.
(Frobenius norm and singular values)
Let \(\bA \in \CC^{m \times n}\). Let the singular value decomposition of \(\bA\) be given by

\[ \bA = \bU \Sigma \bV^H. \]

Let the singular values of \(\bA\) be \(\sigma_1, \dots, \sigma_k\) where \(k = \min(m, n)\). Then

\[ \| \bA \|_F = \left ( \sum_{i=1}^k \sigma_i^2 \right )^{\frac{1}{2}}. \]
Proof. This comes from the invariance of Frobenius norm with unitary transformations.
But

\[ \| \bA \|_F = \| \bU \Sigma \bV^H \|_F = \| \Sigma \|_F \]

since \(\bU\) and \(\bV\) are unitary matrices (see Theorem 4.141 ). Now the only nonzero terms in \(\Sigma\) are the singular values. Hence

\[ \| \bA \|_F = \| \Sigma \|_F = \left ( \sum_{i=1}^k \sigma_i^2 \right )^{\frac{1}{2}}. \]
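Both the unitary invariance and the singular value identity are easy to verify numerically. A minimal sketch (the random matrix and the QR-based construction of a unitary factor are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

# Frobenius norm equals the l2 norm of the singular values
s = np.linalg.svd(A, compute_uv=False)
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))

# pre-multiplication by a unitary matrix leaves the norm unchanged
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
assert np.isclose(np.linalg.norm(Q @ A, 'fro'), np.linalg.norm(A, 'fro'))
```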
4.12.3. Consistency of a Matrix Norm#
(Consistent matrix norm)
A matrix norm \(\| \cdot \|\) is called consistent on \(\CC^{n \times n}\) if

(4.19)\[ \| \bA \bB \| \leq \| \bA \| \| \bB \| \]

holds true for all \(\bA, \bB \in \CC^{n \times n}\). A matrix norm \(\| \cdot \|\) is called consistent if it is defined on \(\CC^{m \times n}\) for all \(m, n \in \Nat\) and (4.19) holds for all matrices \(\bA, \bB\) for which the product \(\bA \bB\) is defined.
A consistent matrix norm is also known as a sub-multiplicative norm.
With this definition and the results in Theorem 4.142 we can see that the Frobenius norm is consistent.
4.12.4. Subordinate Matrix Norm#
A matrix operates on vectors from one space to generate vectors in another space. It is interesting to explore the connection between the norm of a matrix and norms of vectors in the domain and co-domain of a matrix.
Let \(m, n \in \Nat\) be given. Let \(\| \cdot \|_{\alpha}\) be some norm on \(\CC^m\) and \(\| \cdot \|_{\beta}\) be some norm on \(\CC^n\). Let \(\| \cdot \|\) be some norm on matrices in \(\CC^{m \times n}\). We say that \(\| \cdot \|\) is subordinate to the vector norms \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\) if

\[ \| \bA \bx \|_{\alpha} \leq \| \bA \| \| \bx \|_{\beta} \]

for all \(\bA \in \CC^{m \times n}\) and for all \(\bx \in \CC^n\). In other words, the operation of \(\bA\) cannot increase the length of a vector by more than a factor given by the norm of the matrix itself.

If \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\) are the same, then we say that \(\| \cdot \|\) is subordinate to the vector norm \(\| \cdot \|_{\alpha}\).
We have shown earlier in Corollary 4.28 that Frobenius norm is subordinate to Euclidean norm.
4.12.5. Operator Norm#
We now consider the maximum factor by which a matrix \(\bA\) can increase the length of a vector.
(Operator norm)
Let \(m, n \in \Nat\) be given. Let \(\| \cdot \|_{\alpha}\) be some norm on \(\CC^n\) and \(\| \cdot \|_{\beta}\) be some norm on \(\CC^m\). For \(\bA \in \CC^{m \times n}\) we define

\[ \| \bA \|_{\alpha \to \beta} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}. \]

The term \(\frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}\) represents the factor by which the length of \(\bx\) is increased by the operation of \(\bA\). We simply pick the maximum value of such scaling factors over all nonzero \(\bx\).

The norm so defined is known as the \((\alpha \to \beta)\) operator norm, the \((\alpha \to \beta)\)-norm, or simply the \(\alpha\)-norm if \(\alpha = \beta\).
We need to verify that this definition satisfies all properties of a norm.
Nonnegativity
Clearly if \(\bA = \ZERO\) then \(\bA \bx = \bzero\) always, hence \(\| \bA \| = 0\).
Conversely, if \(\| \bA \| = 0\) then \(\| \bA \bx \|_{\beta} = 0 \Forall \bx \in \CC^n\).
Hence \(\bA \bx = \bzero\) for every \(\bx \in \CC^n\).
In particular this is true for the unit vectors \(\be_i \in \CC^n\).
The \(i\)-th column of \(\bA\) is given by \(\bA \be_i\) which is \(\bzero\).
Thus each column in \(\bA\) is \(\bzero\).
Hence \(\bA = \ZERO\).
Homogeneity
Consider \(c \in \CC\).
Then
\[ \| c \bA \| = \underset{\bx \neq \bzero}{\max } \frac{\| c \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} = | c | \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} = | c | \| \bA \|. \]
We now present some useful observations on operator norm before we can prove triangle inequality for operator norm.
For any \(\bx \in \Kernel(\bA)\), \(\bA \bx = \bzero\) hence we only need to consider vectors which don’t belong to the kernel of \(\bA\).
Thus we can write
\[ \| \bA \|_{\alpha \to \beta} = \underset{\bx \notin \Kernel(\bA)} {\max } \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}}. \]We also note that
\[ \frac{\| \bA c \bx \|_{\beta}}{\| c \bx \|_{\alpha}} = \frac{| c | \| \bA \bx \|_{\beta}}{ | c | \| \bx \|_{\alpha}} = \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} \Forall c \neq 0, \bx \neq \bzero. \]Thus, it is sufficient to find the maximum on unit norm vectors:
\[ \| \bA \|_{\alpha \to \beta} = \underset{\| \bx \|_{\alpha} = 1} {\max } \| \bA \bx \|_{\beta}. \]Since \(\|\bx \|_{\alpha} = 1\), the term in the denominator drops out.
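The unit-sphere formulation suggests a crude numerical estimate of an operator norm: sample unit-norm vectors and track the largest amplification. The sketch below (the helper operator_norm_estimate and its parameters are hypothetical, not a library API) produces a lower bound that is typically close to the true value for modest dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)

def operator_norm_estimate(A, p_in, p_out, trials=20000):
    """Crude lower bound on the (p_in -> p_out) operator norm obtained
    by sampling random directions scaled onto the unit p_in-sphere."""
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        x /= np.linalg.norm(x, p_in)              # now ||x||_{p_in} = 1
        best = max(best, np.linalg.norm(A @ x, p_out))
    return best

A = rng.standard_normal((3, 4))
est = operator_norm_estimate(A, 2, 2)
true = np.linalg.norm(A, 2)    # for p_in = p_out = 2 this is sigma_1
print(est, true)               # est <= true
```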
(Operator norm is subordinate)
The \((\alpha \to \beta)\)-operator norm is subordinate to the vector norms \(\| \cdot \|_{\alpha}\) and \(\| \cdot \|_{\beta}\), i.e.

\[ \| \bA \bx \|_{\beta} \leq \| \bA \|_{\alpha \to \beta} \| \bx \|_{\alpha} \Forall \bx \in \CC^n. \]

Proof. For \(\bx = \bzero\) the inequality is trivially satisfied. For \(\bx \neq \bzero\), by definition, we have

\[ \frac{\| \bA \bx \|_{\beta}}{\| \bx \|_{\alpha}} \leq \| \bA \|_{\alpha \to \beta} \implies \| \bA \bx \|_{\beta} \leq \| \bA \|_{\alpha \to \beta} \| \bx \|_{\alpha}. \]
(Existence of the maximizer for operator norm)
There exists a vector \(\bx^* \in \CC^{n}\) with unit norm (\(\| \bx^* \|_{\alpha} = 1\)) such that

\[ \| \bA \|_{\alpha \to \beta} = \| \bA \bx^* \|_{\beta}. \]

Proof. Recall that

\[ \| \bA \|_{\alpha \to \beta} = \underset{\| \bx \|_{\alpha} = 1} {\max } \| \bA \bx \|_{\beta}. \]
The norm function \(\| \cdot \|_{\beta}\) is continuous on \(\CC^m\).
The mapping \(\bx \mapsto \bA \bx\) is continuous.
Hence the function \(\bx \mapsto \| \bA \bx \|_{\beta}\) is continuous.
The set \(\{ \bx \in \CC^n \ST \| \bx \|_{\alpha} = 1\}\) is compact.
Hence, the function attains its supremum over this compact set, i.e. there is a maximizer \(\bx^*\) with \(\| \bx^* \|_{\alpha} = 1\).
We are now ready to prove triangle inequality for operator norm.
(Triangle inequality for operator norm)
Operator norm as defined in Definition 4.145 satisfies triangle inequality.
Proof. Let \(\bA\) and \(\bB\) be some matrices in \(\CC^{m \times n}\).
Consider the operator norm of matrix \(\bA + \bB\).
From previous remarks, there exists some vector \(\bx^* \in \CC^n\) with \(\| \bx^* \|_{\alpha} = 1\) such that
\[ \| \bA + \bB \| = \| (\bA + \bB) \bx^* \|_{\beta}. \]Now
\[ \| (\bA + \bB) \bx^* \|_{\beta} = \| \bA \bx^* + \bB \bx^* \|_{\beta} \leq \| \bA \bx^*\|_{\beta} + \| \bB \bx^*\|_{\beta}. \]From subordinate property of operator norm, we have
\[ \| \bA \bx^*\|_{\beta} \leq \| \bA \| \|\bx^*\|_{\alpha} = \| \bA \| \]and
\[ \| \bB \bx^*\|_{\beta} \leq \| \bB \| \|\bx^*\|_{\alpha} = \| \bB \| \]since \(\| \bx^* \|_{\alpha} = 1\).
Hence we have
\[ \| \bA + \bB \| \leq \| \bA \| + \| \bB \|. \]
It turns out that operator norm is also consistent under certain conditions.
(Consistency of operator norm)
Let \(\| \cdot \|_{\alpha}\) be defined over all \(m \in \Nat\). Let \(\| \cdot \|_{\beta} = \| \cdot \|_{\alpha}\). Then the operator norm

\[ \| \bA \|_{\alpha} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \]

is consistent.

Proof. We need to show that

\[ \| \bA \bB \|_{\alpha} \leq \| \bA \|_{\alpha} \| \bB \|_{\alpha}. \]
Now
\[ \| \bA \bB \|_{\alpha} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]We note that if \(\bB \bx = \bzero\), then \(\bA \bB \bx = \bzero\).
Hence we can rewrite as
\[ \| \bA \bB \|_{\alpha} = \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]Now if \(\bB \bx \neq \bzero\) then \( \| \bB \bx \|_{\alpha} \neq 0\).
Hence
\[ \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} = \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \]and
\[ \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}} \leq \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]Clearly
\[ \| \bB \|_{\alpha} = \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bB \bx \|_{\alpha}}{\| \bx \|_{\alpha}}. \]Furthermore
\[ \underset{\bB \bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\alpha}}{\|\bB \bx \|_{\alpha}} \leq \underset{\by \neq \bzero}{\max } \frac{\| \bA \by \|_{\alpha}}{\| \by \|_{\alpha}} = \|\bA \|_{\alpha}. \]Thus we have
\[ \| \bA \bB \|_{\alpha} \leq \| \bA \|_{\alpha} \| \bB \|_{\alpha}. \]
4.12.6. \(p\)-norm for Matrices#
We recall the definition of \(\ell_p\) norms for vectors \(\bx \in \CC^n\):

\[ \| \bx \|_p = \left ( \sum_{i=1}^{n} | x_i |^p \right )^{\frac{1}{p}}, \quad 1 \leq p < \infty \]

with \(\| \bx \|_{\infty} = \underset{1 \leq i \leq n}{\max} | x_i |\).
The operator norms \(\| \cdot \|_p\) defined from \(\ell_p\) vector norms are of specific interest.
(Matrix \(p\)-norm)

The \(p\)-norm for a matrix \(\bA \in \CC^{m \times n}\) is defined as

\[ \| \bA \|_p = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_p}{\| \bx \|_p} \]

where \(\| \cdot \|_p\) is the standard \(\ell_p\) norm for vectors in \(\CC^m\) and \(\CC^n\).
As per Lemma 4.92, \(p\)-norms for matrices are consistent norms. They are also subordinate to the \(\ell_p\) vector norms.
Special cases are considered for \(p=1,2\) and \(\infty\).
(Max column sum, max row sum, spectral norms)
Let \(\bA \in \CC^{m \times n}\).
For \(p=1\) we have

\[ \| \bA \|_1 = \underset{1 \leq j \leq n}{\max} \sum_{i=1}^{m} | a_{i j} |. \]

This is also known as the max column sum norm.

For \(p=\infty\) we have

\[ \| \bA \|_{\infty} = \underset{1 \leq i \leq m}{\max} \sum_{j=1}^{n} | a_{i j} |. \]

This is also known as the max row sum norm.

Finally for \(p=2\) we have

\[ \| \bA \|_2 = \sigma_1 \]

where \(\sigma_1\) is the largest singular value of \(\bA\). This is also known as the spectral norm.
Proof. Max column sum norm:
Let
\[ \bA = \begin{bmatrix} \ba^1 & \dots & \ba^n \end{bmatrix}. \]Then
\[\begin{split} \begin{aligned} \| \bA \bx \|_1 &= \left \| \sum_{j=1}^n x_j \ba^j \right \|_1 \\ &\leq \sum_{j=1}^n \left \| x_j \ba^j \right \|_1 \\ &= \sum_{j=1}^n |x_j| \left \| \ba^j \right \|_1 \\ &\leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \sum_{j=1}^n |x_j| \\ &= \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \| \bx \|_1. \end{aligned} \end{split}\]Thus,
\[ \| \bA \|_1 = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_1}{\| \bx \|_1} \leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1 \]which is the maximum column sum.
We need to show that this upper bound is indeed an equality.
Indeed for any \(\bx=\be_j\) where \(\be_j\) is a unit vector with \(1\) in \(j\)-th entry and 0 elsewhere,
\[ \| \bA \be_j \|_1 = \| \ba^j \|_1. \]Thus
\[ \| \bA \|_1 \geq \| \ba^j \|_1 \quad \Forall 1 \leq j \leq n. \]Combining the two, we see that
\[ \| \bA \|_1 = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_1. \]
Max row sum norm:
For \(p=\infty\), we proceed as follows.
\[\begin{split} \begin{aligned} \| \bA \bx \|_{\infty} &= \underset{1 \leq i \leq m}{\max} \left | \sum_{j=1}^n a_{i j } x_j \right | \\ & \leq \underset{1 \leq i \leq m}{\max} \sum_{j=1}^n | a_{i j } | | x_j |\\ & \leq \underset{1 \leq j \leq n}{\max} | x_j | \underset{1 \leq i \leq m}{\max} \sum_{j=1}^n | a_{i j } |\\ &= \| \bx \|_{\infty} \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1 \end{aligned} \end{split}\]where \(\underline{\ba}^i\) are the rows of \(\bA\).
This shows that
\[ \| \bA \|_{\infty} \leq \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]We need to show that this is indeed an equality.
Fix an \(i = k\) and choose \(\bx\) such that

\[ x_j = \frac{\overline{a_{k j}}}{| a_{k j} |} \]

(with \(x_j = 1\) whenever \(a_{k j} = 0\)); for real matrices this is just \(x_j = \sgn (a_{k j})\). Clearly \(\| \bx \|_{\infty} = 1\).
Then
\[\begin{split} \begin{aligned} \| \bA \bx \|_{\infty} &= \underset{1 \leq i \leq m}{\max} \left | \sum_{j=1}^n a_{i j } x_j \right | \\ &\geq \left | \sum_{j=1}^n a_{k j } x_j \right | \\ &= \left | \sum_{j=1}^n | a_{k j } | \right | \\ &= \sum_{j=1}^n | a_{k j } |\\ &= \| \underline{\ba}^k \|_1. \end{aligned} \end{split}\]Thus,
\[ \| \bA \|_{\infty} \geq \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]Combining the two inequalities we get:
\[ \| \bA \|_{\infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_1. \]
Spectral norm:
The remaining case is \(p=2\). Let \(\bA = \bU \Sigma \bV^H\) be the singular value decomposition of \(\bA\).
For any vector \(\bx\) with \(\| \bx \|_2 = 1\),
\[ \| \bA \bx \|_2 = \| \bU \Sigma \bV^H \bx \|_2 = \| \bU (\Sigma \bV^H \bx )\|_2 = \| \Sigma \bV^H \bx \|_2 \]since \(\ell_2\) norm is invariant to unitary transformations.
Let \(\bv = \bV^H \bx\).
Then \(\|\bv\|_2 = \| \bV^H \bx \|_2 = \| \bx \|_2 = 1\).
Let \(k = \min(m, n)\).
Now
\[\begin{split} \| \bA \bx \|_2 &= \| \Sigma \bv \|_2\\ &= \left ( \sum_{j=1}^k | \sigma_j v_j |^2 \right )^{\frac{1}{2}}\\ &\leq \sigma_1 \left ( \sum_{j=1}^k | v_j |^2 \right )^{\frac{1}{2}}\\ &= \sigma_1 \| \bv \|_2 = \sigma_1. \end{split}\]This shows that
\[ \| \bA \|_2 \leq \sigma_1. \]Now consider the vector \(\bx = \bV \bv\) with \(\bv = (1, 0, \dots, 0)\), i.e. the first column of \(\bV\).
Then
\[ \| \bA \bx \|_2 = \| \Sigma \bv \|_2 = \sigma_1. \]Thus
\[ \| \bA \|_2 \geq \sigma_1. \]Combining the two, we get that \(\| \bA \|_2 = \sigma_1\).
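All three special cases can be checked against numpy, whose matrix norms for ord equal to 1, 2, and inf implement exactly these formulas:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

max_col_sum = np.max(np.sum(np.abs(A), axis=0))   # p = 1
max_row_sum = np.max(np.sum(np.abs(A), axis=1))   # p = infinity
sigma_1 = np.linalg.svd(A, compute_uv=False)[0]   # p = 2

assert np.isclose(max_col_sum, np.linalg.norm(A, 1))
assert np.isclose(max_row_sum, np.linalg.norm(A, np.inf))
assert np.isclose(sigma_1, np.linalg.norm(A, 2))
```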
4.12.7. The \(2\)-norm#
Let \(\bA \in \CC^{n \times n}\) have singular values \(\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n\). Let the eigenvalues of \(\bA\) be \(\lambda_1, \lambda_2, \dots, \lambda_n\) with \(|\lambda_1| \geq |\lambda_2| \geq \dots \geq |\lambda_n|\). Then the following hold:

\[ \| \bA \|_2 = \sigma_1 \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{\sigma_n}. \]

If \(\bA\) is symmetric and positive definite, then

\[ \| \bA \|_2 = \lambda_1 \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{\lambda_n}. \]

If \(\bA\) is normal, then

\[ \| \bA \|_2 = |\lambda_1| \]

and if \(\bA\) is nonsingular, then

\[ \| \bA^{-1} \|_2 = \frac{1}{|\lambda_n|}. \]
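A numerical check of the Hermitian positive definite case (the construction of A below is just one arbitrary way to obtain such a matrix):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M @ M.conj().T + 4 * np.eye(4)   # Hermitian positive definite

lam = np.linalg.eigvalsh(A)          # real eigenvalues, ascending order
assert np.isclose(np.linalg.norm(A, 2), lam[-1])                    # largest eigenvalue
assert np.isclose(np.linalg.norm(np.linalg.inv(A), 2), 1 / lam[0])  # reciprocal of smallest
```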
4.12.8. Unitary Invariant Norms#
(Unitary invariant norm)
A matrix norm \(\| \cdot \|\) on \(\CC^{m \times n}\) is called unitary invariant if \(\| \bU \bA \bV \| = \| \bA \|\) for any \(\bA \in \CC^{m \times n}\) and any unitary matrices \(\bU \in \CC^{m \times m}\) and \(\bV \in \CC^{n \times n}\).
We have already seen in Theorem 4.141 that the Frobenius norm is unitary invariant.
It turns out that the spectral norm is also unitary invariant.
4.12.9. More Properties of Operator Norms#
In this section we will focus on operator norms connecting normed linear spaces \((\CC^n, \| \cdot \|_{p})\) and \((\CC^m, \| \cdot \|_{q})\). Typical values of \(p, q\) would be in \(\{1, 2, \infty\}\).
We recall that

\[ \| \bA \|_{p \to q} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bx \|_q}{\| \bx \|_p}. \]
The table below [78] shows how to compute different \((p, q)\) norms. Some can be computed easily while others are NP-hard to compute.
| \(p\) | \(q\) | \(\| \bA \|_{p \to q}\) | Calculation |
|---|---|---|---|
| 1 | 1 | \(\| \bA \|_{1}\) | Maximum \(\ell_1\) norm of a column |
| 1 | 2 | \(\| \bA \|_{1 \to 2}\) | Maximum \(\ell_2\) norm of a column |
| 1 | \(\infty\) | \(\| \bA \|_{1 \to \infty}\) | Maximum absolute entry of the matrix |
| 2 | 1 | \(\| \bA \|_{2 \to 1}\) | NP-hard |
| 2 | 2 | \(\| \bA \|_{2}\) | Maximum singular value |
| 2 | \(\infty\) | \(\| \bA \|_{2 \to \infty}\) | Maximum \(\ell_2\) norm of a row |
| \(\infty\) | 1 | \(\| \bA \|_{\infty \to 1}\) | NP-hard |
| \(\infty\) | 2 | \(\| \bA \|_{\infty \to 2}\) | NP-hard |
| \(\infty\) | \(\infty\) | \(\| \bA \|_{\infty}\) | Maximum \(\ell_1\) norm of a row |
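The easily computable rows of this table reduce to column-wise or row-wise vector norms. A small sketch (arbitrary example matrix):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 5))

norm_1_to_2 = np.max(np.linalg.norm(A, axis=0))     # max l2 norm of a column
norm_2_to_inf = np.max(np.linalg.norm(A, axis=1))   # max l2 norm of a row
norm_1_to_inf = np.max(np.abs(A))                   # largest absolute entry

print(norm_1_to_2, norm_2_to_inf, norm_1_to_inf)
```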
The topological dual of the finite dimensional normed linear space \((\CC^n, \| \cdot \|_{p})\) is the normed linear space \((\CC^n, \| \cdot \|_{p'})\) where

\[ \frac{1}{p} + \frac{1}{p'} = 1. \]

The \(\ell_2\)-norm is its own dual; it is self dual.
The \(\ell_1\)-norm and the \(\ell_{\infty}\)-norm are duals of each other.
When a matrix \(\bA\) maps from the space \((\CC^n, \| \cdot \|_{p})\) to the space \((\CC^m, \| \cdot \|_{q})\), we can view its conjugate transpose \(\bA^H\) as a mapping from the space \((\CC^m, \| \cdot \|_{q'})\) to \((\CC^n, \| \cdot \|_{p'})\).
(Operator norm and conjugate transpose)
The operator norm of a matrix always equals the operator norm of its conjugate transpose. In other words

\[ \| \bA \|_{p \to q} = \| \bA^H \|_{q' \to p'} \]

where

\[ \frac{1}{p} + \frac{1}{p'} = 1 = \frac{1}{q} + \frac{1}{q'}. \]

Specific applications of this result are:

\[ \| \bA \|_{2 \to 2} = \| \bA^H \|_{2 \to 2}. \]

This is obvious since the maximum singular value of a matrix and its conjugate transpose are the same.

\[ \| \bA \|_{1 \to 1} = \| \bA^H \|_{\infty \to \infty} \quad \text{and} \quad \| \bA \|_{\infty \to \infty} = \| \bA^H \|_{1 \to 1}. \]

This is also obvious since the max column sum of \(\bA\) is the same as the max row sum norm of \(\bA^H\) and vice versa.
We now need to show the result for the general case (arbitrary \(1 \leq p, q \leq \infty\)).
Proof. TODO
(\(1 \to p\) norm)

\[ \| \bA \|_{1 \to p} = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p \]

where \(\ba^j\) are the columns of \(\bA\).
Proof. Expanding:

\[\begin{split} \begin{aligned} \| \bA \bx \|_p &= \left \| \sum_{j=1}^n x_j \ba^j \right \|_p \\ &\leq \sum_{j=1}^n |x_j| \left \| \ba^j \right \|_p \\ &\leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p \sum_{j=1}^n |x_j| \\ &= \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p \| \bx \|_1. \end{aligned} \end{split}\]

Thus,

\[ \| \bA \|_{1 \to p} \leq \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p. \]

We need to show that this upper bound is indeed an equality.

Indeed for any \(\bx=\be_j\) where \(\be_j\) is a unit vector with \(1\) in the \(j\)-th entry and 0 elsewhere,

\[ \| \bA \be_j \|_p = \| \ba^j \|_p. \]

Thus

\[ \| \bA \|_{1 \to p} \geq \| \ba^j \|_p \quad \Forall 1 \leq j \leq n. \]

Combining the two, we see that

\[ \| \bA \|_{1 \to p} = \underset{1 \leq j \leq n}{\max}\| \ba^j \|_p. \]
(\(p \to \infty\) norm)

\[ \| \bA \|_{p \to \infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_q \]

where \(\underline{\ba}^i\) are the rows of \(\bA\) and \(\frac{1}{p} + \frac{1}{q} = 1\).

Proof. Using Theorem 4.146, we get

\[ \| \bA \|_{p \to \infty} = \| \bA^H \|_{1 \to q}. \]

Using Theorem 4.147, we get

\[ \| \bA^H \|_{1 \to q} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_q \]

since the columns of \(\bA^H\) are the conjugates of the rows of \(\bA\) and the norm depends only on absolute values of entries.

This completes the proof.
(Consistency of \(p \to q\) norms)

For two matrices \(\bA\) and \(\bB\) with \(p,q,s \geq 1\), we have

\[ \| \bA \bB \|_{p \to q} \leq \| \bA \|_{s \to q} \| \bB \|_{p \to s}. \]

Proof. We start with

\[ \| \bA \bB \|_{p \to q} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_q}{\| \bx \|_p}. \]

From Lemma 4.90, we obtain

\[ \| \bA (\bB \bx) \|_q \leq \| \bA \|_{s \to q} \| \bB \bx \|_s \leq \| \bA \|_{s \to q} \| \bB \|_{p \to s} \| \bx \|_p. \]

Thus,

\[ \| \bA \bB \|_{p \to q} \leq \| \bA \|_{s \to q} \| \bB \|_{p \to s}. \]
(Consistency of \(p \to \infty\) norms)

For two matrices \(\bA\) and \(\bB\) and \(p \geq 1\), we have

\[ \| \bA \bB \|_{p \to \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p \to \infty}. \]

Proof. We start with

\[ \| \bA \bB \|_{p \to \infty} = \underset{\bx \neq \bzero}{\max } \frac{\| \bA \bB \bx \|_{\infty}}{\| \bx \|_p}. \]

From Lemma 4.90, we obtain

\[ \| \bA (\bB \bx) \|_{\infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \bx \|_{\infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p \to \infty} \| \bx \|_p. \]

Thus,

\[ \| \bA \bB \|_{p \to \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p \to \infty}. \]
(Dominance of \(p\)-norm on \(p \to \infty\) norm)

In particular

\[ \| \bA \|_{p \to \infty} \leq \| \bA \|_{p}. \]
Also:
Proof. Choosing \(q = \infty\) and \(s = p\) and applying Theorem 4.149

\[ \| \bA \|_{p \to \infty} = \| \bI \bA \|_{p \to \infty} \leq \| \bI \|_{p \to \infty} \| \bA \|_{p \to p}. \]

But \(\| \bI \|_{p \to \infty}\) is the maximum \(\ell_p\) norm of any row of \(\bI\), which is \(1\). Thus

\[ \| \bA \|_{p \to \infty} \leq \| \bA \|_{p}. \]
Consider the expression

\[ \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_q}{\| \bz \|_p}. \]
The constraint \(\bz \in \ColSpace(\bA^H), \bz \neq \bzero\) means there exists some vector \(\bu \notin \Kernel(\bA^H)\) such that \(\bz = \bA^H \bu\).
This expression measures the factor by which the non-singular part of \(\bA\) can change the length of a vector.
The following bound holds for every matrix \(\bA\):

\[ \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_q}{\| \bz \|_p} \geq \frac{1}{\| \bA^{\dag} \|_{q \to p}}. \]

If \(\bA\) is surjective (onto), then the equality holds. When \(\bA\) is bijective (one-one onto, square, invertible), then the result implies

\[ \underset{\bz \neq \bzero}{\min} \frac{\| \bA \bz \|_q}{\| \bz \|_p} = \frac{1}{\| \bA^{-1} \|_{q \to p}}. \]
Proof. The spaces \(\ColSpace(\bA^H)\) and \(\ColSpace(\bA)\) have the same dimension, given by \(\Rank(\bA)\).
We recall that \(\bA^{\dag} \bA\) is a projector onto the column space of \(\bA^H\).
\[ \bw = \bA \bz \iff \bz = \bA^{\dag} \bw = \bA^{\dag} \bA \bz \Forall \bz \in \ColSpace (\bA^H). \]As a result we can write
\[ \frac{\| \bz \|_p}{ \| \bA \bz \|_q} = \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} \]whenever \(\bz \in \ColSpace(\bA^H)\) and \(\bz \neq \bzero\).
Now
\[\begin{split} \left [ \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\min} \frac{\| \bA \bz \|_q}{\| \bz \|_p}\right ]^{-1} = \underset{\substack{\bz \in \ColSpace(\bA^H)\\ \bz \neq \bzero}}{\max} \frac{\| \bz \|_p}{ \| \bA \bz \|_q} = \underset{\substack{\bw \in \ColSpace(\bA) \\ \bw \neq \bzero}}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} \leq \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q}. \end{split}\]When \(\bA\) is surjective, then \(\ColSpace(\bA) = \CC^m\).
Hence
\[\begin{split} \underset{\substack{\bw \in \ColSpace(\bA)\\ \bw \neq \bzero}}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} = \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q}. \end{split}\]Thus, the inequality changes into equality.
Finally
\[ \underset{\bw \neq \bzero}{\max} \frac{\| \bA^{\dag} \bw \|_p}{ \| \bw \|_q} = \| \bA^{\dag} \|_{q \to p} \]which completes the proof.
4.12.10. Row Column Norms#
A common way of measuring the norm of a matrix is to compute the \(\ell_p\) norm for each row and then compute the \(\ell_q\) norm of the resultant column vector. Such norms are known as row column norms.
If \(p=q=2\), then the resultant norm is same as the Frobenius norm.
If \(p=q=1\), then the resultant norm is same as the sum norm.
If \(p=q=\infty\), then the resultant norm is same as the max norm.
Thus, one can think of the row column norms as a generalization of these norms.
(Row-column norms)
Let \(\bA\) be an \(m\times n\) matrix with rows \(\underline{\ba}^i\) as

\[\begin{split} \bA = \begin{bmatrix} \underline{\ba}^1 \\ \vdots \\ \underline{\ba}^m \end{bmatrix}. \end{split}\]

Then we define

\[ \| \bA \|_{p, \infty} = \underset{1 \leq i \leq m}{\max} \| \underline{\ba}^i \|_p \]

where \(1 \leq p < \infty\). In other words, we take \(\ell_p\)-norms of all row vectors and then find the maximum.

We define

\[ \| \bA \|_{\infty, \infty} = \underset{\substack{1 \leq i \leq m \\ 1 \leq j \leq n}}{\max} | a_{i j} |. \]

This is equivalent to taking the \(\ell_{\infty}\) norm on each row and then taking the maximum of all the norms.

For \(1 \leq p , q < \infty\), we define the norm

\[ \| \bA \|_{p, q} = \left ( \sum_{i=1}^m \| \underline{\ba}^i \|_p^q \right )^{\frac{1}{q}}. \]

In other words, we compute the \(\ell_p\)-norm of all the row vectors to form another vector and then take the \(\ell_q\)-norm of that vector.
Note that the norm \(\| \bA \|_{p, \infty}\) is different from the operator norm \(\| \bA \|_{p \to \infty}\). Similarly \(\| \bA \|_{p, q}\) is different from \(\| \bA \|_{p \to q}\).
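Row-column norms are simple to implement. A minimal sketch (the helper row_column_norm is an illustrative name, not a library function), checked against the \((2,2)\) = Frobenius and \((1,1)\) = sum norm identities noted above:

```python
import numpy as np

def row_column_norm(A, p, q):
    """||A||_{p,q}: take the l_p norm of each row, then the l_q norm
    of the resulting vector of row norms."""
    row_norms = np.linalg.norm(A, ord=p, axis=1)
    return np.linalg.norm(row_norms, ord=q)

A = np.arange(6, dtype=float).reshape(2, 3)
assert np.isclose(row_column_norm(A, 2, 2), np.linalg.norm(A, 'fro'))
assert np.isclose(row_column_norm(A, 1, 1), np.sum(np.abs(A)))
```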
\[ \| \bA \|_{q \to \infty} = \| \bA \|_{p, \infty} \]

where

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Proof. From Theorem 4.148 we get

\[ \| \bA \|_{q \to \infty} = \underset{1 \leq i \leq m}{\max}\| \underline{\ba}^i \|_p. \]

This is exactly the definition of \(\| \bA \|_{p, \infty}\).
Proof. We note that:
From Theorem 4.153
For any two matrices \(\bA, \bB\), we have

\[ \| \bA \bB\|_{p, \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p, \infty}. \]
Proof. Let \(q\) be such that \(\frac{1}{p} + \frac{1}{q} = 1\).
From Theorem 4.150, we have
\[ \| \bA \bB \|_{q \to \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{q \to \infty}. \]From Theorem 4.153
\[ \| \bA \bB \|_{q \to \infty} = \| \bA \bB\|_{p, \infty} \]and
\[ \| \bB \|_{q \to \infty} = \| \bB \|_{p, \infty}. \]Thus
\[ \| \bA \bB\|_{p, \infty} \leq \| \bA \|_{\infty \to \infty} \| \bB \|_{p, \infty}. \]
(Relations between \((p,q)\) and \((p \to q)\) norms)

Relations between \((p, q)\) norms and \((p \to q)\) norms:

Proof. The first three are straightforward applications of Theorem 4.153. The next two are applications of Theorem 4.154. See also Table 4.1.
4.12.11. Block Diagonally Dominant Matrices and Generalized Gershgorin Circle Theorem#
In [35] the idea of diagonally dominant matrices (see Diagonally Dominant Matrices) has been generalized to block matrices using matrix norms. We consider the specific case with spectral norm.
(Block diagonally dominant matrix)
Let \(\bA\) be a square matrix in \(\CC^{n \times n}\) which is partitioned in the following manner

\[\begin{split} \bA = \begin{bmatrix} \bA_{11} & \bA_{12} & \dots & \bA_{1 k}\\ \bA_{21} & \bA_{22} & \dots & \bA_{2 k}\\ \vdots & \vdots & \ddots & \vdots\\ \bA_{k 1} & \bA_{k 2} & \dots & \bA_{k k} \end{bmatrix} \end{split}\]

where each of the submatrices \(\bA_{i j}\) is a square matrix of size \(m \times m\). Thus \(n = k m \).

\(\bA\) is called block diagonally dominant if

\[ \frac{1}{\| \bA_{i i}^{-1} \|_2} \geq \sum_{\substack{j = 1 \\ j \neq i}}^{k} \| \bA_{i j} \|_2 \]

holds true for all \(1 \leq i \leq k\).

If the inequality holds strictly for all \(i\), then \(\bA\) is called a block strictly diagonally dominant matrix.
(Nonsingularity of block strictly diagonally dominant matrices)
If the partitioned matrix \(\bA\) of Definition 4.149 is block strictly diagonally dominant, then it is nonsingular.
For proof see [35].
This leads to the generalized Gershgorin disc theorem.
(Generalized Gershgorin disc theorem)
Let \(\bA\) be a square matrix in \(\CC^{n \times n}\) which is partitioned in the following manner

\[\begin{split} \bA = \begin{bmatrix} \bA_{11} & \bA_{12} & \dots & \bA_{1 k}\\ \bA_{21} & \bA_{22} & \dots & \bA_{2 k}\\ \vdots & \vdots & \ddots & \vdots\\ \bA_{k 1} & \bA_{k 2} & \dots & \bA_{k k} \end{bmatrix} \end{split}\]

where each of the submatrices \(\bA_{i j}\) is a square matrix of size \(m \times m\).

Then each eigenvalue \(\lambda\) of \(\bA\) satisfies

\[ \frac{1}{\| \left ( \bA_{i i} - \lambda \bI \right )^{-1} \|_2} \leq \sum_{\substack{j = 1 \\ j \neq i}}^{k} \| \bA_{i j} \|_2 \quad \text{for some } 1 \leq i \leq k. \]
For proof see [35].
Since the \(2\)-norm of a positive semidefinite matrix is nothing but its largest eigenvalue, the theorem directly leads to the following result.
(\(2\)-norm of a Hermitian positive semidefinite matrix)

Let \(\bA\) be a Hermitian positive semidefinite matrix. Let \(\bA\) be partitioned as in Theorem 4.158. Then its \(2\)-norm \(\| \bA \|_2\) satisfies

\[ \| \bA \|_2 \leq \underset{1 \leq i \leq k}{\max} \left ( \| \bA_{i i} \|_2 + \sum_{\substack{j = 1 \\ j \neq i}}^{k} \| \bA_{i j} \|_2 \right ). \]