
Norm

Norm of vector

A norm is an operator that assigns elements a kind of ‘length’ or ‘size’. Mathematically, a norm is defined as follows.

$\mathbf{Definition.}$ Norm
An operator $|| \cdot || : \mathbb{C}^m \to \mathbb{R}$ is a norm if, for all $\mathbf{x}, \mathbf{y} \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$, it satisfies:
1) $|| \mathbf{x} || \geq 0$, and $|| \mathbf{x} || = 0 \leftrightarrow \mathbf{x} = \mathbf{0}$;
2) $|| \mathbf{x} + \mathbf{y} || \leq || \mathbf{x} || + || \mathbf{y} ||$ (triangle inequality);
3) $|| \alpha \mathbf{x} || = | \alpha | \, || \mathbf{x} ||$ (absolute homogeneity).

Thus, any operator that satisfies the above 3 axioms is called a norm. In practice, various kinds of norms are used.

$L_p$ norm

The $L_p$ norm is defined as

\[\begin{align*} \| \mathbf{x} \|_p = (\sum_{i = 1}^m | x_i |^p)^{\frac{1}{p}} \end{align*}\]

The most commonly used $L_p$ norms are the $L_1$, $L_2$, and $L_\infty$ norms.



$p = 1$

The $L_1$ norm is also called the Manhattan norm or taxicab norm:

\[\begin{align*} \| \mathbf{x} \|_1 = \sum_{i = 1}^m | x_i | \end{align*}\]

$p = 2$

When $p=2$, we call it $L_2$ norm or Euclidean norm:

\[\begin{align*} \| \mathbf{x} \|_2 = \sqrt{\sum_{i = 1}^m | x_i |^2} \end{align*}\]

$p = \infty$

The parameter $p$ may also tend to infinity. When $p = \infty$, we call it the $L_\infty$ norm, supremum norm, or maximum norm:

\[\begin{align*} \| \mathbf{x} \|_\infty &= \displaystyle \lim_{p \to \infty} \left( \sum_{i = 1}^m | x_i |^p \right)^{\frac{1}{p}} \\ &= \text{max } \{ |x_1|, \cdots, |x_m| \} \end{align*}\]
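These three special cases are easy to compute directly; the following is a minimal NumPy sketch (the concrete vector is just an illustration):

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])

# L1: sum of absolute values
l1 = np.sum(np.abs(x))                # 19.0
# L2: Euclidean length
l2 = np.sqrt(np.sum(np.abs(x) ** 2))  # 13.0
# L-infinity: largest absolute component
linf = np.max(np.abs(x))              # 12.0

# Cross-check against NumPy's built-in vector norms
assert np.isclose(l1, np.linalg.norm(x, 1))
assert np.isclose(l2, np.linalg.norm(x, 2))
assert np.isclose(linf, np.linalg.norm(x, np.inf))
```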

Weighted $p$-norm

As a variation of the $L_p$ norm, the weighted $p$-norm is defined as follows:

\[\begin{align*} \| \mathbf{x} \|_{\mathbf{w}, p} &= \| W \mathbf{x} \|_p = (\sum_{i = 1}^m | w_i x_i |^p)^{\frac{1}{p}} \end{align*}\]

where $W = \text{diag}(\mathbf{w})$.
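The weighted $p$-norm can be computed either through the diagonal matrix $W = \text{diag}(\mathbf{w})$ or elementwise; both forms agree (example values are arbitrary):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
w = np.array([2.0, 0.5, 1.0])
p = 2

# Form 1: || W x ||_p with W = diag(w)
W = np.diag(w)
via_matrix = np.linalg.norm(W @ x, p)

# Form 2: elementwise (sum |w_i x_i|^p)^(1/p)
via_elementwise = np.sum(np.abs(w * x) ** p) ** (1 / p)

assert np.isclose(via_matrix, via_elementwise)
```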


Norm of matrix

The previous definition of a norm is an operator that takes a vector as input and outputs a non-negative real number. Can we extend it to an operator on matrices?

(Induced) Matrix Norm

Using the norm of the vector, one can define the norm of the matrix as follows:

$\mathbf{Definition.}$ (Induced) Matrix Norm
Let $A \in \mathbb{C}^{m \times n}$ be given. A matrix norm induced by $|| \cdot ||$, $|| A ||$, is the smallest $C \geq 0$ such that $\forall \mathbf{x} \in \mathbb{C}^n$, $|| A \mathbf{x} || \leq C || \mathbf{x} ||$, i.e.,

\[\begin{align*} \| A \| &= \underset{\mathbf{x} \in \mathbb{C}^n, \mathbf{x} \neq \mathbf{0}}{\text{sup }} \frac{\| A \mathbf{x} \|}{\| \mathbf{x} \|} \\ &= \underset{\mathbf{y} \in \mathbb{C}^n, \| \mathbf{y} \| = 1}{\text{sup }} \| A \mathbf{y} \| \end{align*}\]

$\mathbf{Remark.}$ We may interpret $|| A ||$ as the maximum factor by which $A$ can stretch $\mathbf{x}$. In the special case of the induced $L_2$ norm, it is also called the spectral norm.
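The “maximum stretch” interpretation can be seen empirically: for random vectors, the ratio $\| A \mathbf{x} \| / \| \mathbf{x} \|$ never exceeds the induced norm. A small check using NumPy’s induced $L_2$ norm (the matrix is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

# Induced L2 norm of A (NumPy computes it as the largest singular value)
spec = np.linalg.norm(A, 2)

# No random direction is stretched by more than ||A||_2
ratios = [np.linalg.norm(A @ x) / np.linalg.norm(x)
          for x in rng.standard_normal((1000, 2))]
assert max(ratios) <= spec + 1e-12
```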

$L_p$ matrix norm

We may use the $L_p$ vector norms to induce matrix norms.

$p = 1$

The $L_1$ norm of a matrix is defined as

\[\begin{align*} \| A \|_1 = \underset{\mathbf{x} \in \mathbb{C}^n, \| \mathbf{x} \|_1 = 1}{\text{sup }} \| A \mathbf{x} \|_1 \end{align*}\]

$p = 2$

The $L_2$ norm of a matrix is defined as

\[\begin{align*} \| A \|_2 = \underset{\mathbf{x} \in \mathbb{C}^n, \| \mathbf{x} \|_2 = 1}{\text{sup }} \| A \mathbf{x} \|_2 \end{align*}\]

$\mathbf{Remark.}$ If $A = \mathbf{a}^\star$ is a single row vector, $|| A ||_2$ corresponds to the vector norm $|| \mathbf{a} ||_2$: by the Cauchy-Schwarz inequality,

\[\begin{align*} \| A \mathbf{x} \|_2 = | \mathbf{a}^\star \mathbf{x} | \leq \| \mathbf{a} \|_2 \cdot \| \mathbf{x} \|_2 \end{align*}\]

and the bound is tight when $\mathbf{x} = \mathbf{a} / \| \mathbf{a} \|_2$.

$p=\infty$

The $L_\infty$ norm of a matrix is defined as

\[\begin{align*} \| A \|_\infty = \underset{\mathbf{x} \in \mathbb{C}^n, \| \mathbf{x} \|_\infty = 1}{\text{sup }} \| A \mathbf{x} \|_\infty \end{align*}\]




Here are some useful facts to compute the $L_p$ matrix norm:

$\mathbf{Thm\ 1.}$ The $L_1$ matrix norm equals the maximum absolute column sum:

\[\begin{align*} \| A \|_1 = \underset{1 \leq j \leq n}{\text{max }} \| \mathbf{a}_j \|_1 \end{align*}\]
$\mathbf{Proof.}$

Let $\mathbf{a}_1, \cdots, \mathbf{a}_n$ be column vectors of a matrix $A$.

\[\begin{align*} \| A \mathbf{x} \|_1 &= \| \sum_{j = 1}^n x_j \mathbf{a}_j \|_1 \\ &\leq \sum_{j = 1}^n | x_j | \| \mathbf{a}_j \|_1 \\ &\leq \underset{1 \leq j \leq n}{\text{max }} \{ \| \mathbf{a}_j \|_1 \} \sum_{j = 1}^n | x_j | \\ &= \underset{1 \leq j \leq n}{\text{max }} \{ \| \mathbf{a}_j \|_1 \} \| \mathbf{x} \|_1 \\ \end{align*}\]

The bound is attained at $\mathbf{x} = \mathbf{e}_j$, the standard basis vector selecting the column with maximum $|| \mathbf{a}_j ||_1$. Done.
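$\mathbf{Thm\ 1}$ is easy to verify numerically, since NumPy’s matrix 1-norm is exactly the maximum absolute column sum:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# Maximum absolute column sum: max(|1|+|3|, |-2|+|4|) = 6
max_col_sum = np.max(np.sum(np.abs(A), axis=0))

assert np.isclose(np.linalg.norm(A, 1), max_col_sum)
```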





There are special cases where the $L_2$ matrix norm can be easily computed.

$\mathbf{Thm\ 2.}$ $L_2$ norm of the outer product $\mathbf{u} \mathbf{v}^\star$

\[\begin{align*} \| \mathbf{u} \mathbf{v}^\star \|_2 = \| \mathbf{u} \|_2 \cdot \| \mathbf{v} \|_2 \end{align*}\]
$\mathbf{Proof.}$
\[\begin{align*} \| \mathbf{u} \mathbf{v}^\star \mathbf{x} \|_2 = \| \mathbf{u} \|_2 \cdot | \left< \mathbf{v}, \mathbf{x} \right> | \leq \| \mathbf{u} \|_2 \cdot \| \mathbf{v} \|_2 \cdot \| \mathbf{x} \|_2 \end{align*}\]

Since the equality is achieved when $\mathbf{x} = \mathbf{v}$, done.
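A quick numerical check of the rank-1 case, with concrete vectors chosen so both norms are exact integers:

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])   # ||u||_2 = 3
v = np.array([3.0, 4.0])        # ||v||_2 = 5

A = np.outer(u, v)              # rank-1 outer product u v^T

# ||u v*||_2 = ||u||_2 * ||v||_2 = 15
assert np.isclose(np.linalg.norm(A, 2),
                  np.linalg.norm(u) * np.linalg.norm(v))
```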


$\mathbf{Thm\ 3.}$ $L_2$ norm of diagonal matrix $D \in \mathbb{C}^{m \times m}$

\[\begin{align*} \| D \|_2 = \underset{1 \leq i \leq m}{\text{max }} \{ | d_i | \} \end{align*}\]
$\mathbf{Proof.}$

Let $D = \text{diag}(d_1, \cdots, d_m) \in \mathbb{C}^{m \times m}$. Then,

\[\begin{align*} \| D \mathbf{x} \|_2^2 &= \sum_{i=1}^m | d_i x_i |^2 \\ &\leq \underset{1 \leq i \leq m}{\text{max }} \{ | d_i | \}^2 \sum_{i=1}^m | x_i |^2 \\ &= \underset{1 \leq i \leq m}{\text{max }} \{ | d_i | \}^2 \| \mathbf{x} \|_2^2 \end{align*}\]

Equality is attained at $\mathbf{x} = \mathbf{e}_i$ for the index $i$ that maximizes $| d_i |$, done.
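Numerically, the $L_2$ norm of a diagonal matrix is just its largest diagonal entry in absolute value:

```python
import numpy as np

d = np.array([1.0, -3.0, 2.0])
D = np.diag(d)

# ||D||_2 = max |d_i| = 3
assert np.isclose(np.linalg.norm(D, 2), np.max(np.abs(d)))
```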






$\mathbf{Thm\ 4.}$ The $L_\infty$ matrix norm equals the maximum absolute row sum:

\[\begin{align*} \| A \|_\infty = \underset{1 \leq i \leq m}{\text{max }} \| \mathbf{a}_i^\star \|_1 \end{align*}\]
$\mathbf{Proof.}$

Let $\mathbf{a}_1^\star, \cdots, \mathbf{a}_m^\star$ be the row vectors of $A$, and consider $|| \mathbf{x} ||_\infty = 1$.

\[\begin{align*} \| A \mathbf{x} \|_\infty &= \text{max } \{ | \mathbf{a}_1^\star \mathbf{x} |, \cdots, | \mathbf{a}_m^\star \mathbf{x} | \} \\ &\leq \underset{1 \leq i \leq m}{\text{max }} \{ \sum_{j=1}^n | A_{ij} | |x_j| \} \\ &\leq \| \mathbf{x} \|_\infty \cdot \underset{1 \leq i \leq m}{\text{max }} \{ \sum_{j=1}^n | A_{ij} | \} \\ &= \underset{1 \leq i \leq m}{\text{max }} \{ \sum_{j=1}^n | A_{ij} | \} \end{align*}\]

Equality is achieved by choosing $x_j = \overline{A_{ij}} / | A_{ij} |$ (and $x_j = 1$ where $A_{ij} = 0$) for the maximizing row $i$, done.
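As with $\mathbf{Thm\ 1}$, this is easy to confirm with NumPy, whose matrix $\infty$-norm is the maximum absolute row sum:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# Maximum absolute row sum: max(|1|+|-2|, |3|+|4|) = 7
max_row_sum = np.max(np.sum(np.abs(A), axis=1))

assert np.isclose(np.linalg.norm(A, np.inf), max_row_sum)
```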




General Matrix Norm

Indeed, matrix norms need not be induced by vector norms; it suffices to satisfy the 3 axioms of a norm. A famous example of this case is the Frobenius norm.

Frobenius norm

The Frobenius norm, also called the Hilbert-Schmidt norm, is defined as:

\[\begin{align*} \| A \|_F &= \sqrt{\sum_{i=1}^m \sum_{j=1}^n | a_{ij} |^2} \\ &= \sqrt{\text{tr}(A^\star A)} \\ &= \sqrt{\text{tr}(A A^\star)} \end{align*}\]
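The three expressions agree, as a small example shows (the matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Entrywise definition
fro = np.sqrt(np.sum(np.abs(A) ** 2))

# Trace forms and NumPy's built-in Frobenius norm
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))
assert np.isclose(fro, np.sqrt(np.trace(A @ A.T)))
assert np.isclose(fro, np.linalg.norm(A, 'fro'))
```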




Significant Properties of Norm

Cauchy-Schwarz & Hölder Inequality

$\mathbf{Thm\ 5.}$ Hölder Inequality

For $\frac{1}{p} + \frac{1}{q} = 1 \; (1 \leq p, q \leq \infty)$, the following inequality holds:

\[\begin{align*} | \left< \mathbf{x}, \mathbf{y} \right> | \leq \| \mathbf{x} \|_p \cdot \| \mathbf{y} \|_q \end{align*}\]

and it is known as the Hölder inequality. In the special case $p = q = 2$, we call it the Cauchy-Schwarz inequality.
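A numerical spot check of Hölder’s inequality for the conjugate pair $p = 3$, $q = 3/2$, along with the Cauchy-Schwarz special case (random vectors with a fixed seed):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# Conjugate exponents: 1/3 + 2/3 = 1
p, q = 3.0, 1.5
lhs = np.abs(np.dot(x, y))
rhs = np.linalg.norm(x, p) * np.linalg.norm(y, q)
assert lhs <= rhs + 1e-12

# p = q = 2 recovers Cauchy-Schwarz
assert lhs <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
```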

Bounding $|| AB ||$

$\mathbf{Thm\ 6.}$ For any $A \in \mathbb{C}^{l \times m}$ and $B \in \mathbb{C}^{m \times n}$,

\[\begin{align*} \| AB \| \leq \| A \| \cdot \| B \|. \end{align*}\]
$\mathbf{Proof.}$

The proof is quite simple. For any vector $\mathbf{x} \in \mathbb{C}^n$,

\[\begin{align*} \| AB \mathbf{x} \| \leq \| A \| \cdot \| B \mathbf{x} \| \leq \| A \| \cdot \| B \| \cdot \| \mathbf{x} \| \end{align*}\]

and taking the supremum over $\| \mathbf{x} \| = 1$ yields the claim.


Note that equality need not hold; for example, $|| A^n || \neq || A ||^n \; (n \geq 2)$ in general.
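Submultiplicativity holds for each of the induced $L_p$ norms above; a random spot check (fixed seed, dimensions arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# ||AB|| <= ||A|| ||B|| for the induced 1-, 2-, and infinity-norms
for p in (1, 2, np.inf):
    assert (np.linalg.norm(A @ B, p)
            <= np.linalg.norm(A, p) * np.linalg.norm(B, p) + 1e-12)
```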

$\mathbf{Remark.}$ We can bound the Frobenius norm, too. Writing $\mathbf{a}_i^\star$ for the rows of $A$ and $\mathbf{b}_j$ for the columns of $B$, the Cauchy-Schwarz inequality gives $| (AB)_{ij} | = | \mathbf{a}_i^\star \mathbf{b}_j | \leq \| \mathbf{a}_i \|_2 \cdot \| \mathbf{b}_j \|_2$, so

\[\begin{align*} \| AB \|_F^2 &= \sum_{i=1}^l \sum_{j=1}^n | (AB)_{ij} |^2 \\ &\leq \sum_{i=1}^l \sum_{j=1}^n ( \| \mathbf{a}_i \|_2 \cdot \| \mathbf{b}_j \|_2 )^2 \\ &= \sum_{i=1}^l \| \mathbf{a}_i \|_2^2 \sum_{j=1}^n \| \mathbf{b}_j \|_2^2 \\ &= \| A \|_F^2 \| B \|_F^2._\blacksquare \end{align*}\]



Unitary matrix preserves norm

$\mathbf{Thm\ 7.}$ For any $A \in \mathbb{C}^{m \times n}$ and unitary $Q$ of compatible dimension ($Q \in \mathbb{C}^{m \times m}$ on the left, $Q \in \mathbb{C}^{n \times n}$ on the right),

\[\begin{align*} &\| QA \|_2 = \| AQ \|_2 = \| A \|_2 \\ &\| QA \|_F = \| AQ \|_F = \| A \|_F \end{align*}\]
$\mathbf{Proof.}$
For the induced $L_2$ norm, $Q^\star Q = I$ gives

\[\begin{align*} \| QA \mathbf{x} \|_2 = \sqrt{ (QA \mathbf{x})^\star QA \mathbf{x} } = \sqrt{ \mathbf{x}^\star A^\star A \mathbf{x} } = \| A \mathbf{x} \|_2 \end{align*}\]

and for the Frobenius norm,

\[\begin{align*} \| QA \|_F = \sqrt{\text{tr}(A^\star Q^\star QA)} = \sqrt{\text{tr}(A^\star A)} = \| A \|_F. \end{align*}\]

For right multiplication, the substitution $\mathbf{y} = Q \mathbf{x}$ preserves the norm, so

\[\begin{align*} \| AQ \| &= \underset{\| \mathbf{x} \| = 1}{\text{sup }} \| AQ \mathbf{x} \| = \underset{\mathbf{y} = Q \mathbf{x}, \| Q\mathbf{x} \| = 1}{\text{sup }} \| A \mathbf{y} \| = \underset{\| \mathbf{y} \| = 1}{\text{sup }} \| A \mathbf{y} \| \end{align*}\]

\[\begin{align*} \| AQ \|_F = \| (AQ)^\star \|_F = \| Q^\star A^\star \|_F = \| A^\star \|_F = \| A \|_F \end{align*}\]


Note that the left-multiplication identities $|| QA ||_2 = || A ||_2$ and $|| QA ||_F = || A ||_F$ remain valid if $Q$ is generalized to a rectangular matrix with orthonormal columns.
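Unitary invariance is easy to observe numerically; here random orthogonal factors (the real case of unitary) are obtained via QR factorization, with a fixed seed:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

# Random orthogonal (real unitary) matrices from QR
Q1, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # left factor, 4x4
Q2, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # right factor, 3x3

# Both the L2 and Frobenius norms are unchanged by Q1 A and A Q2
for ord_ in (2, 'fro'):
    assert np.isclose(np.linalg.norm(Q1 @ A, ord_), np.linalg.norm(A, ord_))
    assert np.isclose(np.linalg.norm(A @ Q2, ord_), np.linalg.norm(A, ord_))
```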

Reference

[1] Lloyd N. Trefethen & David Bau III, Numerical Linear Algebra, 1997.
