Friday 21 June 2024

Geometric sum as functional inverse (2): the backstory

My latest post about using geometric sums as inverses was surprisingly well-received, so I decided to give it a sequel.

The immediate question is: are we reinventing the Maclaurin series? Is this even possible without differentiation? Well, not really. If we decompose $f$ into a power series we will see precisely what happened. If $f = \sum a_k x^k$ then $a_i = D^i[\sum a_k x^k](0)/i!$, where $D$ is the differential operator (written this way because I am too lazy to typeset fractions). We can recover the same information by calculating $M^i[D^i[\sum a_k x^k](0)]$:
$M^i[D^i[\sum a_k x^k](0)] = M^i[i!a_i] = a_ix^i$.
In other words, we are just representing the $x^i$-term as $a_ix^i$, recovered by integrating the constant $i!a_i$, instead of reading off the coefficient $a_i$ directly.

Let us recall the definition of the (properly defined) operator $M$: $Mf(x) = \int ^x_0 f(u)du$.

The miracle of the exponential series actually happens because $M^i[1]$ is precisely the $x^i$-term of the exponential series. Clearly this is not guaranteed because $M^i[g] = a_ix^i$ for all $i$ iff $g$ is a constant. Sadly $g$ is pre-determined by the function $f$ we use.
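For the skeptical reader, here is a small sympy sketch (my own throwaway script, with a hypothetical helper called M, not any standard API) that iterates the integral operator on the constant function 1 and checks that each iterate is exactly the corresponding term $x^i/i!$ of the exponential series.

# Check: iterating M (integration from 0) on the constant 1 gives x^i / i!,
# i.e. precisely the terms of the exponential series.
import sympy as sp

x, u = sp.symbols('x u')

def M(f):
    # the integral operator Mf(x) = integral of f(u) from 0 to x
    return sp.integrate(f.subs(x, u), (u, 0, x))

term = sp.Integer(1)
for i in range(6):
    assert sp.simplify(term - x**i / sp.factorial(i)) == 0
    term = M(term)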

Consider $f = \log (1+x)$. It looks bad further away from zero but it should be fine in a neighborhood around zero. Suppose we wave our hands and say that from $(I-M)f = g$ we obtain $f = (\sum M^k)g$; we then want to check what $M^kg$ is exactly.

First we compute $g = (I-M)f = x(1-\log(1+x))$. Not good-looking but okay. Next we compute $Mg$:
$Mg = \frac{1}{2}(x^2+\frac{1}{2}x(x-2)-(x^2-1)\log(1+x))$
according to WolframAlpha. Soon you realize that $M^kg$ will never be the $k$-th term of the Maclaurin series, because the log term never goes away.
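If you would rather make sympy suffer instead of WolframAlpha, here is a rough sketch (same hypothetical helper M as before) that reproduces these iterates; the thing to notice is that every $M^kg$ still carries a $\log(1+x)$ factor, so it can never be a single Maclaurin term.

# Iterate M on g = (I-M)[log(1+x)] and watch the log term persist.
import sympy as sp

x, u = sp.symbols('x u')

def M(f):
    return sp.integrate(f.subs(x, u), (u, 0, x))

f = sp.log(1 + x)
g = sp.simplify(f - M(f))          # g = x*(1 - log(1+x))
iterate = g
for k in range(3):
    print(k, sp.simplify(iterate)) # k = 0, 1, 2: g, Mg, M^2 g
    iterate = M(iterate)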

Everything we argued in $L^{\infty}$ still works, even for this nasty $f$! Each $M^kg$ is a proper function and the sum indeed converges to $f$ uniformly if we restrict the domain to $[-\varepsilon, \varepsilon]$ for some small $\varepsilon \in (0,1)$. However, it simply does not coincide with the Maclaurin series because $g$ is not a constant. When is $g$ a constant, or rather, when is $(I-M)f$ a constant? The differential equation $y-y' = 0$ has solutions $ce^x$, and this is the only instance that fits all our speculation.

What a pity.

You may then ask: what about the trig example at the end? Why are we recovering the Maclaurin series like that even when the theory has already crumbled?

To clarify, let us repeat the computation from above: $g = (I-M)[\sin] = \sin x + \cos x -1$.
$Mg = \sin x - \cos x + 1 -x$.
$M^2g = -\sin x - \cos x + 1 + x - x^2/2$.

Well...an oscillation? Nope. This is clear if you plot them on a graph.
Red is $g$, blue is $Mg$ and green is $M^2g$. As predicted by the norm of $M$, the $L^{\infty}$-norms do shrink exponentially, which gives us the convergence, but just like our first example it simply does not coincide with the power series because $g$ is not a constant.
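(The plot itself lives on the blog; as a stand-in, here is a crude numerical check of mine -- a grid maximum standing in for the sup norm, on $[0, 1/2]$ for concreteness -- showing the norms of $g$, $Mg$, $M^2g, \ldots$ decaying geometrically.)

# Crude check: sup norm (grid maximum) of M^k g on [0, 1/2] for g = (I-M)[sin].
import numpy as np
import sympy as sp

x, u = sp.symbols('x u')

def M(f):
    return sp.integrate(f.subs(x, u), (u, 0, x))

g = sp.sin(x) + sp.cos(x) - 1      # g = (I - M)[sin]
grid = np.linspace(0, 0.5, 201)
iterate = g
for k in range(5):
    vals = sp.lambdify(x, iterate, 'numpy')(grid)
    print(k, float(np.max(np.abs(vals))))
    iterate = M(iterate)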

The last hope lingers on the fact that $(\sum (M^4)^k)[x - x^3/6] = \sin x$. Is there anything special about the trig functions that allows such a representation? I will truly leave that to the readers this time. Hint: what is the relation between $\sin x$ and $\exp x$?

Sunday 16 June 2024

Use of geometric sum as inverse in function spaces

When I teach first year linear algebra, one of the main shows is the use of diagonalization to simplify matrix computations and then to link that with various applications. This is where you feel the real use of these concepts beyond making up their geometric meaning: Cayley-Hamilton, characteristic polynomials for recurrence sequences, Markov chains and so on.

To start with, consider a matrix $A$. If $\sum A^k = I + A + A^2 + \ldots $ is well-defined (i.e. converges) then by the geometric series we know it is equal to $(I-A)^{-1}$. Going the other way round, if we compute the geometric series of $I-A$, we recover the inverse $(I-(I-A))^{-1} = A^{-1}$. The problem is...does the geometric sum converge? This is where the eigenvalues kick in.

For a diagonal matrix $D$, $\sum D^k$ converges iff all diagonal entries (aka eigenvalues) are of norm strictly less than 1. If $A$ is diagonalizable we can write $A = PDP^{-1}$, so that $(I-A) = P(I-D)P^{-1}$, and we want the diagonal entries of $(I-D)$ to be in $(-1,1)$. Or rather, we want all eigenvalues of $A$ to be in $(0,2)$. But this is a very limited result. Can we do better? Yes!

Notice that we can tweak the eigenvalues by multiplying by a constant. Let $c \in \mathbb{R}$ be non-zero such that all eigenvalues of $A/c$ are in $(0,2)$; then the same procedure goes through.

Consider the matrix $A = \begin{bmatrix}5&4\\2&3\end{bmatrix}$ with eigenvalues 1 and 7, and take $c = 8$. Write $A = PDP^{-1}$; then $I - D/c$ has diagonal entries 7/8 and 1/8. The geometric sum gives $\sum (I-D/c)^k = \begin{bmatrix}8&0\\0&8/7\end{bmatrix}$.

Plugging back we have $P(\sum (I-D/c)^k)P^{-1} = \frac{8}{7}\begin{bmatrix}3&-4\\-2&5\end{bmatrix}$, which is precisely $cA^{-1}$. Don't forget that $(A/c)^{-1} =cA^{-1}\neq A^{-1}/c$!
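As a quick numerical sanity check (a numpy sketch of mine, not part of the original argument), one can sum the powers of $I - A/c$ directly, without even diagonalizing, and watch the partial sums approach $cA^{-1}$.

# Partial sums of the geometric series of (I - A/c) approach c * A^{-1}.
import numpy as np

A = np.array([[5.0, 4.0],
              [2.0, 3.0]])
c = 8.0
B = np.eye(2) - A / c              # eigenvalues of B are 7/8 and 1/8

S = np.zeros((2, 2))
term = np.eye(2)
for _ in range(200):               # enough terms for (7/8)^k to be negligible
    S += term
    term = term @ B

print(S)                           # ~ (8/7) * [[3, -4], [-2, 5]]
print(c * np.linalg.inv(A))        # should match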

*

Now let us forget about linear algebra and take a step forward. How about using geometric sums in function spaces?

We start with a very vague idea. Let $M$ be the integral operator; then we obviously obtain $(I-M)[\exp] = -C$, where $C$ arises from the integral. What if $(I-M)$ has inverse $(I+M+M^2+\ldots)$? Let's say $M[-C] = -Cx + C_1$, then $M^2[-C] = -Cx^2/2 + C_1x + C_2$, and so on. One obtains $(\sum M^k)[-C] = (-C+\sum C_i)(1 + x + x^2/2 + \ldots)$, and checking $x = 0$ gives $-C + \sum C_i = 1$. This is such a lucid dream because we just proved the series $e^x = \sum x^k/k!$ without differentiation at all!!

But wait, there are too many technical problems in the above argument, to the point that if I read it in a student's assignment I would have torn the paper into pieces. The gaps are not fatal, but we have to be very, very careful.

The biggest problem of all is that the integral operator, in this arbitrary form, is likely unbounded. We must sacrifice our dream of working over the reals and work on $[0,N]$ instead. Since $N$ is arbitrary we can extend the result to essentially the whole real line, at least pointwise.

On a bounded interval the $L^p$ spaces work similarly, but the best choice is $L^{\infty}$ because we immediately get the norm of the integral operator. Define $M_0:L^{\infty}([0,N]) \to L^{\infty}([0,N])$ by $M_0f(x) = \int ^x_0f(u)du$. This is well defined because $|M_0f(x)| \leq N\|f\|_{\infty}$, which also gives $\|M_0\| \leq N$; in fact $\|M_0\| = N$, the equality being obvious by taking $f = 1$.

We solved the unboundedness, but we need $\| M \| < 1$ for the geometric sum to work. What should we do now? Well, just copy what we did with matrices. Define $M = M_0/(N+1)$ and try again: $M[e^x] = (e^x-1)/(N+1)$ but $I[e^x] = e^x$...they don't add up. It turns out that tweaking $M$ alone won't suffice; a tweak to the function is necessary as well.

Let $f = e^{x/(N+1)}$, then $Mf = (e^{x/(N+1)} - 1)$ and so $(I-M)f = 1$. Since $\|M\| < 1$, $\sum M^k$ is a well-defined operator and is the inverse of $(I-M)$. Therefore $f = (\sum M^k)[1]$. Induction gives $M^k[1](x) = (x/(N+1))^k/k!$, so $e^{x/(N+1)} = \sum (x/(N+1))^k/k!$ uniformly in $[0,N]$. A substitution recovers the desired series for $e^x$.
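To see the uniform convergence numerically, here is a short numpy sketch (my own, with $N = 5$ picked arbitrarily) computing the sup error on $[0,N]$ between the partial sums of $(\sum M^k)[1]$ and $e^{x/(N+1)}$; the error shrinks comfortably within what the bound $\|M\| = N/(N+1) < 1$ guarantees.

# Partial sums of sum_k M^k[1] = sum_k (x/(N+1))^k / k! versus e^{x/(N+1)} on [0, N].
import numpy as np

N = 5.0
xs = np.linspace(0.0, N, 501)
target = np.exp(xs / (N + 1))

partial = np.zeros_like(xs)
term = np.ones_like(xs)                      # M^0[1] = 1
for k in range(30):
    partial += term
    term = term * (xs / (N + 1)) / (k + 1)   # M^{k+1}[1] = (x/(N+1))^{k+1}/(k+1)!
    if k in (4, 9, 19, 29):
        print(k + 1, np.max(np.abs(partial - target)))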

Many would think the job is done here, but this is a false sense of security. It is not an easy gap to spot -- it slipped past me when I first drafted this. We established the equality in $L^{\infty}$, i.e. uniformly, but in the pointwise sense this only holds almost everywhere. The good news is that we no longer have to worry about the trap of generating the series using the series itself, so many more tools are now available, like continuity. Since both the exponential function and the series are continuous everywhere, the two expressions are indeed equal everywhere.

*

I believe this is the first time I have written about functional analysis here, even though it is actually my expertise. This is absolutely not what one would encounter in research, but it is great fun to share with undergraduate students of all levels.

The idea that $\sum A^k = (I-A)^{-1}$ is the cornerstone of spectral theory: when we look at the spectrum we want to know the set of $\lambda$ where $A-\lambda I$ is invertible, or equivalently (for $\lambda \neq 0$) where $I - A/\lambda$ is invertible. The fact that the geometric series converges says that when $A$ is small enough (norm less than 1) then $(I-A)$ is always invertible, which gives us lots of information about the spectrum.

Of course, knowing that $(I-A)$ is invertible for small $A$ says nothing about the other operators in the space. In particular, it says nothing about the invertibility of operators that fail the condition for the geometric sum.

This is particularly clear when we revisit the matrix example. I claim that the above method applies to all diagonalizable matrices whose eigenvalues are all of the same sign (and non-zero) (proof: left to the reader). But that leaves out almost all diagonalizable matrices whose eigenvalues contain both signs. Are they not invertible? Of course they are! The moment they are diagonalizable with non-zero eigenvalues they are straightaway invertible; it is just that this particular approach doesn't apply.

One problem for the readers: my proof above relies on the fact that $e^0 = 1$. Can you recreate the proof for $x < 0$?

We finish with some extra food for thought: one computes $M^4[\sin](x) = \sin x - x + x^3/6$, so $(I-M^4)[\sin](x) = x - x^3/6$. You already see the pattern: $(\sum (M^4)^k)[x - x^3/6]$ is precisely the series for $\sin x$. (Technical problems like convergence and range are easy, as above, so we leave them to the readers as well.) Can we do the same for other functions that solve differential equations like these? Does it depend on the radius of convergence? Or...does it apply to all $C^{\infty}$ functions? How interesting...
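For the curious, here is one last sympy sketch (again with my hypothetical helper M) that verifies $M^4[\sin](x) = \sin x - x + x^3/6$ and watches the partial sums of $(\sum (M^4)^k)[x - x^3/6]$ rebuild the Maclaurin series of $\sin x$.

# Verify M^4[sin](x) = sin x - x + x^3/6 and sum the first few (M^4)^k terms.
import sympy as sp

x, u = sp.symbols('x u')

def M(f):
    return sp.integrate(f.subs(x, u), (u, 0, x))

def M4(f):
    for _ in range(4):
        f = M(f)
    return f

assert sp.simplify(M4(sp.sin(x)) - (sp.sin(x) - x + x**3 / 6)) == 0

h = x - x**3 / 6                   # h = (I - M^4)[sin]
partial, term = sp.Integer(0), h
for k in range(4):
    partial += term
    term = M4(term)
print(sp.expand(partial))          # x - x^3/3! + x^5/5! - ... up to degree 15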