Sunday, 16 June 2024

Use of geometric sum as inverse in function spaces

When I teach first-year linear algebra, one of the main highlights is the use of diagonalization to simplify matrix computations, and then linking that to various applications. This is the point where you feel the real use of these concepts beyond their geometric meaning: Cayley-Hamilton, characteristic polynomials for recurrence sequences, Markov chains and so on.

To start with, consider a matrix $A$. If $\sum A^k = I + A + A^2 + \ldots $ is well-defined (aka converges), then by the geometric series formula it equals $(I-A)^{-1}$. Going the other way round, if we compute the geometric series of $I-A$, we recover the inverse $(I-(I-A))^{-1} = A^{-1}$. The problem is...does the geometric sum converge? This is where the eigenvalues kick in.
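For the record, the convergence claim is just the telescoping identity from the scalar case carried over:

$$(I-A)\left(I + A + \cdots + A^n\right) = I - A^{n+1},$$

so whenever $A^{n+1} \to 0$ the partial sums converge and the limit is a (two-sided) inverse of $I-A$.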

For a diagonal matrix $D$, $\sum D^k$ converges iff all diagonal entries (aka eigenvalues) have absolute value strictly less than 1. If $A$ is diagonalizable we can write $A = PDP^{-1}$; then $(I-A) = P(I-D)P^{-1}$, so we want the diagonal entries of $(I-D)$ to lie in $(-1,1)$, or equivalently, all eigenvalues of $A$ to lie in $(0,2)$. But this is a very limited result. Can we do better? Yes!

Notice that we can tweak the eigenvalues by multiplying by a constant. Let $c \in \mathbb{R}$ be non-zero such that all eigenvalues of $A/c$ lie in $(0,2)$; then the same procedure goes through.

Consider the matrix $A = \begin{bmatrix}5&4\\2&3\end{bmatrix}$ with eigenvalues 1 and 7; take $c = 8$. Write $A = PDP^{-1}$; then $I - D/c$ has diagonal entries 7/8 and 1/8. The geometric sum gives $\sum (I-D/c)^k = \begin{bmatrix}8&0\\0&8/7\end{bmatrix}$.

Plugging back we have $P(\sum (I-D/c)^k)P^{-1} = \frac{8}{7}\begin{bmatrix}3&-4\\-2&5\end{bmatrix}$, which is precisely $cA^{-1}$. Don't forget that $(A/c)^{-1} =cA^{-1}\neq A^{-1}/c$!
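If you want to see the numbers for yourself, here is a quick sanity check in numpy (just a sketch: the 200-term cutoff is arbitrary, and the sum converges because the spectral radius of $I - A/c$ is $7/8 < 1$):

```python
import numpy as np

A = np.array([[5.0, 4.0],
              [2.0, 3.0]])
c = 8.0
B = np.eye(2) - A / c   # eigenvalues 7/8 and 1/8, so the series converges

# partial geometric sum I + B + B^2 + ... + B^200
S = np.eye(2)
term = np.eye(2)
for _ in range(200):
    term = term @ B
    S += term

print(S)                       # ~ c * A^{-1} = (8/7) [[3, -4], [-2, 5]]
print(c * np.linalg.inv(A))    # agrees to ~1e-11
```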

*

Now let us forget about linear algebra and take a step forward. How about using the geometric sum in function spaces?

We start with a very vague idea. Let $M$ be the (indefinite) integral operator; then we obviously obtain $(I-M)[\exp] = -C$ where $C$ is the constant arising from the integral. What if $(I-M)$ has inverse $(I+M+M^2+\ldots)$? Say $M[-C] = -Cx + C_1$, then $M^2[-C] = -Cx^2/2 + C_1x + C_2$, and so on. One obtains $(\sum M^k)[-C] = (-C+\sum C_i)(1 + x + x^2/2 + \ldots)$, and checking $x = 0$ gives $-C + \sum C_i = 1$. This is such a lucid dream, because we just proved the series $e^x = \sum x^k/k!$ without any differentiation at all!!

But wait, there are so many technical problems in the above argument that if I read it in a student's assignment I would have torn the paper into pieces. The gaps are not fatal, but we have to be very, very careful.

The biggest problem of all is that the integral operator, in that arbitrary form, is likely unbounded. We must sacrifice our dream of working over the whole real line and work on $[0,N]$ instead. Since $N$ is arbitrary, we can then extend the result back to the reals, at least pointwise.

On a bounded interval the $L^p$ spaces all work similarly, but the best choice is $L^{\infty}$ because it immediately gives us the norm of the integral operator. Define $M_0:L^{\infty}([0,N]) \to L^{\infty}([0,N])$ by $M_0f(x) = \int ^x_0f(u)\,du$. This is well defined because $|M_0f(x)| \leq N\|f\|_{\infty}$, and the same bound shows $\|M_0\| \leq N$; equality holds by taking $f = 1$.
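Spelling that out: for every $x \in [0,N]$,

$$|M_0f(x)| = \left|\int^x_0 f(u)\,du\right| \leq \int^x_0 \|f\|_{\infty}\,du \leq N\|f\|_{\infty},$$

which gives $\|M_0\| \leq N$, while $M_0[1](x) = x$ attains $\|M_0[1]\|_{\infty} = N = N\|1\|_{\infty}$.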

That settles boundedness, but we need $\| M \| < 1$ for the geometric sum to work. What should we do now? Well, just copy what we did with matrices. Define $M = M_0/(N+1)$, so that $\|M\| = N/(N+1) < 1$, and try again: $M[e^x] = (e^x-1)/(N+1)$ but $I[e^x] = e^x$...they don't add up. It turns out that tweaking $M$ won't suffice: a tweak to the function is necessary as well.

Let $f = e^{x/(N+1)}$; then $Mf = e^{x/(N+1)} - 1$ and so $(I-M)f = 1$. Since $\|M\| < 1$, $\sum M^k$ is a well-defined operator and is the inverse of $(I-M)$. Therefore $f = (\sum M^k)[1]$. Induction gives $M^k[1](x) = (x/(N+1))^k/k!$, so $e^{x/(N+1)} = \sum (x/(N+1))^k/k!$ uniformly on $[0,N]$. A substitution recovers the desired series for $e^x$.
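Here is a numerical sketch of the whole construction (the discretization by cumulative trapezoid sums is my own choice here, as are $N = 3$, the grid size and the 60-term cutoff):

```python
import numpy as np

N = 3.0
n = 1000
x = np.linspace(0.0, N, n)
dx = x[1] - x[0]

def M(f):
    """Discretized M = M_0/(N+1): cumulative trapezoid integral from 0, scaled."""
    F = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) / 2) * dx))
    return F / (N + 1)

# f = (sum_k M^k)[1], the partial geometric series applied to the constant 1
f = np.ones(n)
term = np.ones(n)
for _ in range(60):
    term = M(term)
    f += term

# should match e^{x/(N+1)} up to discretization error
print(np.max(np.abs(f - np.exp(x / (N + 1)))))
```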

Many would think the job is done here, but that is a false sense of security. The remaining gap is not easy to spot -- I missed it myself in the first draft. We established the equality in the $L^{\infty}$ norm, which in the pointwise sense only gives equality almost everywhere. The good news is that we no longer have to worry about the circularity trap of generating the series using the series itself, so many more tools are now available, like continuity. Both the exponential function and the series are continuous everywhere, and two continuous functions that agree almost everywhere agree everywhere, so the two expressions are indeed equal everywhere.

*

I believe this is the first time I have written about functional analysis here, even though it is actually my area of expertise. These tricks are absolutely not what one would encounter in research, but they are great fun to share with undergraduates at all levels.

The idea that $\sum A^k = (I-A)^{-1}$ is a cornerstone of spectral theory: when we study the spectrum, we want to identify the set of $\lambda$ for which $A-\lambda I$ is invertible, or equivalently (for $\lambda \neq 0$) where $I - A/\lambda$ is invertible. The convergence of the geometric series says that when $A$ is small enough relative to $I$, the operator $I-A$ is always invertible, and that alone gives us a lot of information about the spectrum.
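Concretely, and this is a completely standard consequence: if $|\lambda| > \|A\|$ then $\|A/\lambda\| < 1$, so

$$A - \lambda I = -\lambda\left(I - A/\lambda\right)$$

is invertible with inverse $-\frac{1}{\lambda}\sum_k (A/\lambda)^k$. The spectrum of $A$ is therefore trapped inside the closed disk of radius $\|A\|$.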

Of course, knowing that $(I-A)$ is invertible for small $A$ says nothing about the other operators in the space. In particular, it says nothing about the invertibility of operators that fail the convergence condition for the geometric sum.

This is particularly clear when we revisit the matrix example. I claim that the above method applies to every diagonalizable matrix whose eigenvalues are all non-zero and of the same sign (proof: left to the reader). But that leaves out all the diagonalizable matrices whose eigenvalues come in both signs. Does that mean they are not invertible? Of course not! The moment a matrix is diagonalizable with non-zero eigenvalues it is invertible; it is just that this particular approach doesn't apply.

One problem for the readers: my proof above relies on the fact that $e^0 = 1$. Can you recreate the proof for $x < 0$?

We finish with extra food for thought: one computes $M^4[\sin](x) = \sin x - x + x^3/6$, so $(I-M^4)[\sin](x) = x - x^3/6$. You can already see the pattern: $(\sum (M^4)^k)[x - x^3/6]$ is precisely the series for $\sin x$. (Technical issues like convergence and range are handled just as above, so they are left to the readers as well.) Can we do the same for other functions that solve differential equations? Does it depend on the radius of convergence? Or...does it apply to all $C^{\infty}$ functions? How interesting...
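As a parting numerical sketch (same trapezoid discretization as before, my own choice again; I take the interval $[0,1]$, where the repeated-integration bound gives $\|M^4\| \leq 1/4! < 1$ and puts us back in the setting above):

```python
import numpy as np

n = 2000
x = np.linspace(0.0, 1.0, n)    # on [0,1] one checks ||M^4|| <= 1/24 < 1
dx = x[1] - x[0]

def M(f):
    # discretized integration from 0 (cumulative trapezoid rule)
    return np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) / 2) * dx))

def M4(f):
    for _ in range(4):
        f = M(f)
    return f

g = x - x**3 / 6                # (I - M^4)[sin]
s = g.copy()
term = g.copy()
for _ in range(30):             # partial sum of (sum_k (M^4)^k)[g]
    term = M4(term)
    s += term

print(np.max(np.abs(s - np.sin(x))))   # small uniform error
```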
