Friday 21 November 2014

A trip to South Island, New Zealand (2): Akaora

Although Lyttelton is the centre of the 2011 earthquake it doesn't look like it's more damaged than the Christchurch city centre, this is partially because their assets as a tourist spot is quite localized and the cost to recover them is relatively lower (of course it really takes some money and time to recover various monuments). This is the same for the rest of the Banks peninsula, like Akaroa.

Sunday 9 November 2014

A trip to South Island, New Zealand (1): Christchurch

Welcome back ladies and gentlemen. I don't have much to talk about academic stuff recently simply due to the high workload I have had in the last semester, which is now over. I've now got some time to write about my trip, which is an amazing experience.

The trip included some typical spots but not all of them. In particular most of the time was spent into outdoor activities so I didn't have much city experience this time (except Christchurch I guess). Anyway I guess people who travel in New Zealand expect such kind of tours right?

I planned to write a series of entries about the trip and hopefully this will be updated weekly. I hope this serves as a guide for those who are interested to travel like me, who do not used to spend a lot in the tours but to enjoy the pace of a place. And of course, to enjoy the food.

Tuesday 5 August 2014

Text on introductory probability

To access the document version of the notes please go to the "Notes Corner" or my personal site directly.

The document can be viewed as the completion of a previous entry NSS combinatorics.


The document aims to provide a really basic introduction to probability. Elementary calculus(differentiation, integration, FToC) is assumed. They will be used in continuous random variables. Note that the text is not of a very pure taste. Computational aspects are covered in the text because it is a vital part of modern theory as well. Sometimes R codes and pseudo codes are included here, and it is perfectly fine to skip them.


Ch.1 Set and probability space
Ch.2 Combinatorics
Ch.3 Random variable
   --- RVs, transformations, conditionals, EVs
Ch.4 Randomness generation
Ch.5 Sum of random variables
   --- Convolution, law of large number, CLT


Sunday 20 July 2014

20 July 2014

What a shame that I didn't write anything for the last 3 months here. I don't have any issues in mathematics to talk about here, so this is a rare entries with my own updates.

Firstly a faithful congratulation toward the schoolmate in my secondary school, who won a silver medal in IMO 2014 at a relatively young age. Without doubt that he is capable and will achieve more in the future.


RAMMASUN (09W) whose forced a typhoon signal no. 3 has just been dissipated in Vietnam. This is a rare occasion when a typhoon can develop greatly in northern South China Sea close to the coastal line. Traditionally geographical environment is not very favouring for super typhoon because the sea body is not large enough (comparing with East Sea and north west Pacific Sea) and not deep enough (when the typhoon absorbs the heat in the sea surface, warmer sea water comes up to provide more energy) to support a strong typhoon. However RAMMASUN reached 135kt before landing under the assistance of most if not all the other positive factor, and once more we can appreciate the beauty of nature here. Of course we also hope that those who are injured, homeless, or lost their beloved due to the storm, will get well soon.

09W has gone, but MATMO (10W) is coming, also with 95W and 96W. For a typhoon fans it is reasonable for us to expect more. It is worthy to mention that HKO again has an almost perfect prediction on the track of RAMMASUN in the South China Sea. Although MATMO is still in the Pacific side, it is interesting to see which observatory can make an accurate prediction on the behaviour of the storm as well as the subtropical ridge.


Chelsea my favourite team has bought a few stars this transfer window that we will expect more next year. We have to say, even after the World Cup, that Luiz is a crucial player against the strong teams. Without him the team can't really achieve a stunning 5W1L against the other top 4 teams. Of course, 50M pounds is a good deal, then we have to find more suitable players to substitute him. Yesterday I watched the 3-2 reversal against Wimbledon where Terry score twice --- I can see many young players with potentials but yet need more real bloody experience. Want some more,Vitesse?


I have written a set of notes for my friends on probability. After some reforming I will probably release it here. Same as the complex analysis last semester, the courses that I will take this semester --- functional analysis and manifold analysis is quite specific and I may not be able to address something easy and meaningful here, but I will try, if this is possible.

Friday 18 April 2014

No game, no life --- game theory case analysis

No game no life is a recent anime that highlights the world that takes game contract as the highest principle, and game theory plays a superior role in the story in a very natural way. Now I would like to analysis one of the game in episode 2 as follows. For simplicity I would use Bob and Alice --- the common name in cryptography --- to describe the whole story.

Bob "Hey Alice let's play paper-scissor-rocks :) I will always play paper."
Alice "What if you don't play paper?"
Bob "If I play don't play paper you win if it's a draw [under original rules] and you draw if you lose [under original rules]. If I win you have to obey my order."
Alice "What if I get a draw?"
Bob "Then you will have to do my little favour :P"
Alice "What favour?"

The payout table is as follows

It turned out that Alice had scissor and Bob had rock --- a draw --- with consequences equivalent to a lose as Bob defined. The analysis from Alice: based on random choices there are a 2/3 chance to win for rocks and scissors, but Bob claimed to play scissor so she'd better play scissor. Bob made a conclusion that the risk for Alice is invariant despite the rules:


Consider the following payout matrix.

$P_1 = \begin{bmatrix}1&0&1\\1&1&0\\-1&1&0\end{bmatrix}$, $P_2 = \begin{bmatrix}1&-1&1\\1&1&-1\\-1&1&-1\end{bmatrix}$

Where $P_1$ is the matrix that Alice thought to be but $P_2$ is the real one. The matrix $P$ is defined as follows: $[P]_{ij}$ is the amount you win if you play the j-th option (vertical) and the opp plays the i-th option (horizontal). (Some textbooks may switch between you and your opp but it's just a matter of perspective on whether you, the analyzer is taking part in the game.)

By strong duality we know that the optimal strategy for both Bob and Alice should have the same expectation. Now suppose Alice has a probability vector $\vec{x}$ to play the three choices with payout matrix $P$. Then the expected outcome for Bob is given by $P\vec{x}$ for different choices from Bob. He as the payer would of course want to minimize the paid. Then we can set up the following linear program for Alice, to maximize the gain when Bob minimizes her gain among the three choices.

$\max (\min ([P\vec{x}]_i)) ~~~\text{subject to}~~~\sum [\vec{x}]_i = 1, \vec{x}\geq \vec{0}$.

And by adding dummy variable we have:

$\max z ~~~\text{subject to}~~~P\vec{x} - z\vec{b}\geq \vec{0},~\sum [\vec{x}]_i = 1,~\vec{x}\geq \vec{0}$

Where $\vec{b} = (1,1,1)^T$. This will be the optimal strategy for Alice.

Using $P_1$ we have $\vec{x}^* = (0,0.5,0.5)^T$ with expected gain $0.5$. Using $P_2$ we have the same optimal vector, but the expected gain is zero. It can be concluded that Alice had the perception that she has an advantage in the game, while actually she had not.

The matrix with four -1 and five 1, turns out to be a fair game (mathematically).


Gambling at high stakes are of course a totally different game. There are not enough games for one to apply central limit theorem so that the result converges so that they can play in the most probable way. In these games all factors including the psychological status shall be considered and here Alice obviously did not have a good enough mind to do such analysis, so she eventually lose the game.

Can you think of other non-trivial matrix which is fair as well?

Sunday 30 March 2014

Sequence of random variables of smaller variance in average

Define the problem as follows.

Suppose we have a sequence of random variables $X_1, X_2,...$ that takes value $\left\{ 0,1\right\}$ with $\lim _{n\rightarrow \infty} \frac{\sum_{i=1}^n X_i}{n} = 0.5$. i.e. In long term this is a fair coin toss. Design an mechanism so that for some $n$ and $k$, the distribution of $\sum_{i=n}^{n+k-1} X_i$ has a smaller variance.

It is very clear that if the variables are independent then the distribution is always fixed. Therefore we focus on dependent variables, in particular we consider creating a Markov Chain, that $X_{n+k}$ depends on $\frac{\sum_{i=n}^{n+k-1} X_i}{k}$. We call this dependence function be $f: [0,1]\rightarrow [0,1]$, which is rotational symmetric around $(0.5,0.5)$.

In order to investigate the effectiveness of our mechanism we have to go deep into the calculations because the mean calculations won't work (we need variance!)

Consider $k=2$ with $f(x) = 0.9 - 0.8x$ and the four states are $00,01,10,11$. Then the transition matrix is given by:

$P = \begin{bmatrix} 0.1&0&0.5&0\\0.9&0&0.5&0\\0&0.5&0&0.9\\0&0.5&0&0.1\end{bmatrix}$

Where the transition matrix assuming independence equal to

$P = \begin{bmatrix} 0.5&0&0.5&0\\0.5&0&0.5&0\\0&0.5&0&0.5\\0&0.5&0&0.5\end{bmatrix}$

Since $P$ represents a regular Markov chain it's clear that it has an eigenvalue 1 with the steady state vector $\frac{5}{28}(1,1.8,1.8,1)^T$. Comparing with $(0.25,0.25,0.25,0.25)^T$ this one has a high tendency at the central.

What about higher k? Consider the case $k=3$ with 8 states $000,001,010,011,100,101,110,111$, where

$P =\begin{bmatrix} \frac{1}{10}&0&0&0&\frac{11}{30}&0&0&0\\ \frac{9}{30}&0&0&0&\frac{19}{30}&0&0&0\\ 0&\frac{11}{30}&0&0&0&\frac{19}{30}&0&0\\ 0&\frac{19}{30}&0&0&0&\frac{11}{30}&0&0\\ 0&0&\frac{11}{30}&0&0&0&\frac{19}{30}&0\\ 0&0&\frac{19}{30}&0&0&0&\frac{11}{30}&0\\ 0&0&0&\frac{19}{30}&0&0&0&\frac{9}{10}\\ 0&0&0&\frac{11}{30}&0&0&0&\frac{1}{10}\\\end{bmatrix}$

With steady state vector, according to mathematica, $c(0.161905, 0.397403, 0.397403, 0.397403, 0.397403, 0.397403, 0.397403, 0.161905)^T$ (where $c$ is the constant to make the sum of the probabilities 1.) Surely there is a higher tendency to stay at the centre, but it looks less centre-located than $k=2$. Of course it's very hard to determine the variance through the steady state vector from the Markov chain as every states represents a series of random variables, but we can plot the distribution for it as follows. Here we sample 10000 times for $\sum_{i=1}^{100} X_i$.

Here we see the black line as the standard binomial distribution, red as $k=2$, green as $k = 8$, blue as $k=25$ and mathematically the line tends to the standard binomial distribution as $n \rightarrow \infty$. If we take a function that is closer to $f(x) = 0.5$ (take the distance in inner product space, if you like) it will, of course, closer to the original distribution.

For the above graph, red refers to $y = 0.9-0.8x$, green is $y = 0.8-0.6x$ while blue is $0.7-0.4x$, and the difference is obvious.

Convexity also affect the ability of 'forcing' the RV back to the mean using the same idea (distance between functions). The variance is naturally smaller if $f(x)$ is larger approaching to $0.5^-$ and smaller from $0.5^+$. If you do some Fourier analysis we can 'force' the random variable sequence into a perfect alternating 0 1 0 1 0 1 0 1!!!

The red line refers to $f(x) = 0.5 + 0.4((1-x)^5-x^5))$, the green line being $f(x) = 0.9-0.8x$ and the blue line is $f(x) = 0.5+0.4\cos (\pi x)$.

Let's demonstrate the Fourier series correlation here. For the function $f: [0,1]\rightarrow [0,1]$ where $f(x) = 1$ for $x\in [0,0.5]$ and $f(x) = 0$ for $x\in (0.5,1]$ we have $f(x) = \frac{1}{2} + \sum_{k=1}^{\infty}\frac{2\sin ((4n-2)\pi x)}{(2n-1)\pi}$. Turncating the values that exceeds 1 and below 0 we have the following distribution by taking $1,2,3$ terms with $k=15$:

We put a large graph here because it takes great effort for us to separate the three lines. Zooming gives

We see (once more) that Fourier series converges very quick despite the occurrence of Gibbs' phenomenon here. There's an interesting question that why does the distribution converges to something non-trivial instead of going to the impulse $\delta (\frac{n}{2})$, where the sequence is perfectly 1 0 1 0 ......? I'll leave this question to the reader as the solution is quite simple.

I don't intend to 'teach' something here but creating a sequence of dependent random variables is a very practical topic and the variance could play an important role. For instance in some round-based RPG battle you certainly don't want your game to be dominated by randomness, so it makes sense to restrict the randomness so that it's about the same for both sides over most if not all short periods. The faster converging speed is what we want to tweek the variance.

Foods for thought.

1) Why does the distribution using Fourier series as $f(x)$ does not converge to an impulse, where the sequence of distribution has no variance? You may want to change the following variables to see what happens (a) $n$, the length of sequence simulated (b) $k$ the length of correlation (c) the number of terms in the Fourier series or the turncation (d) numerical precision.

2) It is possible to calculate the variance directly from the Markov chain steady state probability?

3) By inverting  the idea we can't create sequence of RVs of larger variance directly, for instance, $f(x) = 0.5-0.4\cos (\pi x)$ doesn't give a sequence of larger variance. Think about if it is possible, then see if there are practical applications for it.

4) For the Markov Chain case $k=3$, why does the steady state probability for all 6 states except $000,111$ are equal? (i.e., That looks 'unusual'.)

R code, for $f(x)$:

f  = function(n,k,g) {
  s = NULL
  for(i in 1:k) {
    s = c(s,rbinom(1,1,0.5))
  for(i in (k+1):n) {
    prev_mean = mean(s[(i-k):(i-1)])
    s = c(s,rbinom(1,1,g(prev_mean)))

Saturday 1 March 2014

Triangle centres and choice of coordination

Recently I have a few tutorial sessions with others on NSS maths and olympiad maths and there are a few questions on geometry which is not very easy to be solved by analytical method. With or without extra restrictions we may be able to solve them in coordinate system, but sometimes we don't know where should we start our calculation. What should we do to make these calculations much simplified?

Before doing this we can review a few operations in coordinate geometry in senior secondary school level.

1) Straight lines, in general form $y = mx+b$, or point slope form $(y-y_0) = m(x-x_0)$.

For parallel translation consider $y= mx + b + \epsilon $ or $(y- y_0- \epsilon _y ) = m(x-x_0 - \epsilon)$.

For perpendicular lines consider lines with slope $-m^{-1}$.

2) Centres of triangle. They are discussed in the previous entry.

3) Locus. Common locus includes straight line(s), circle and conic sections. The standard form can be generalized by $a(x-x_0)^2 + b(y-y_0)^2 = r^2$.

Mathematics is consistent (at this level) in the sense that using different methods will give you the same answer. However there are many approaches, in particular there are many method to define a geometry problem in the coordinate plane.

To define a problem in the coordination system first we define the coordinate system to be used. For 2D geometry the three choices are Cartesian, polar and Argand coordinate. Cartesian is the obvious choice as polar produces ALOT of trouble when you operate points besides the origin (we will see this later), and argand deals more with dynamic stuffs like transitions, but we don't want things to move. In the static case Argand is similar to Cartesian but the later can be much easier.

The second step is to put the points into the coordinate. You can rotate and transform freely as long as this does not violate the original given information. A very common transformation is to put two points of the triangle at (0,0) and (0,1), then the last point can be placed at $(x,y) \in {\mathbb{R}^+}^2$. (This could lead to a paradox, but this is beyond our scope.)

The third step is to deduce the transformed relation in the new coordinate, which is technically the most complicated part. You will find angle bisector giving you so much pain. Consider the following example.

Example 1. Given that $\tan (\theta ) = k$. Show that $\tan (\frac{\theta}{2}) = \frac{k}{\sqrt{1+k^2}+1}$.

This can be performed on a right-angled triangle, then the general case immediately follows. (why?)

Consider three points $A(0,0)$, $B(1,0)$ and $C(0,k)$ (since this is a right-angled triangle) and $D$ be the point of intersection of $AC$ and the angle bisector of $B$.

By angle bisection theorem, $\frac{DA}{DC} = \frac{BA}{BC}$ so that $DA = \frac{k}{\sqrt{1+k^2}+1}$ and the result follows.

Example 2. Verify that $r = \frac{a+b-c}{2}$ where $a,b,c,r$ is the sidelengths and inradius respectively if (and in fact, only if), $ABC$ is a right-angled triangle.

We had a proof using Pyth. them to get the result directly, in a much easier way, but this time we try to use a purely coordinate-geom approach.

Here we take the same set up with example 1. Then the angle bisector of $A$ is given by $y=x$ (if the triangle is not right-angled the formulation of this one will be complicated too). The slope $m_{CB} = -k$ and the angle bisector of $B$ has slope $m_{B'} = \frac{-k}{\sqrt{1+k^2}+1}$. Then we have two equations to find the incenter (two is enough, assuming the fact that angle bisectors are concurent): $y = x$ and $y = m_{B'}(x-1)$, which gives us $y = x = r = \frac{k}{1+k+\sqrt{1+k^2}}$. $r=y$ because the foot from the incenter to $AB$ is vertical.

Notice that all right angled triangle can be transformed into the above form, then $r =  \frac{k}{1+k+\sqrt{1+k^2}} = \frac{1+k-\sqrt{1+k^2}}{2}$ can be shown by rationalization.

When you have an isosceles triangle you may want it to be symmetrical in your coordinate system.

Example 3. (PCIMC 2014 Heat SS Q17) In triangle ABC, $AB=AC$. $D$ is on $AB$ so that $AD:DB = 2:5$. Extend $BC$ to $E$ so that $DE = EC$. If the area of triangle DEB is 10, find the area of triangle ABC.

Note that in this question transformation does not change the given information. Notice that there is a ratio $2:5$ so we may want to put $A$ at $(0,7)$ and do the rest of the calculations. However this question can be solved easily with rotation only (rotation do not alter ANY information, unless they give you a pre-assigned coordinate with slopes, etc.). Put $A$ at $(0,a)$ with $C(0,x)$, $B(0,-x)$. Then $D$ is at $(-\frac{2}{7}x, \frac{5}{7}a)$ (why?) and by symmetry $E$ is at $(\frac{-11}{7}x,0)$.

Now consider the area of triangle DEB we have $\frac{1}{2} \frac{4x}{7} \frac{5a}{7} = 10$, which gives us the area of triangle ABC $\frac{(2x)(a)}{2} = 49$.

Readers may want to try the case when $a$ is fixed. Now we perform one classic example: the collinearity of centres of triangle. It may look hard but this is not that hard as one imagined as: (1) it does not involve the stupid incenter, with all those complicated trigonometric functions (2) we use a nice coordination to make calculation easier.

By arbitrary transformation consider the triangle $A(0,0)$, $B(1,0)$ and $C(a,b)$ at the first quadrant.

The centroid is easy: $G(\frac{a+1}{3}, \frac{b}{3})$.

For the orthocenter, notice that $m_{BC} = \frac{y}{x-1}$ so that the altitude from $A$ to $BC$ has slope $\frac{1-x}{y}$. Moreover the altitude from $C$ to $AB$ is vertical so that we have two equations: $x=a$ and $y = \frac{1-a}{b}x = \frac{a(1-a)}{b}$ so that the orthocenter is at $H(a,\frac{a(1-a)}{b})$.

For the circumcenter, note that the perpendicular bisector of $AB$ is $x=\frac{1}{2}$. Now the perpendicular bisector of $BC$ has equation $y = \frac{b}{2} + \frac{1-a}{b}(x-\frac{a+1}{2})$. Solving the equation we have the circumcenter $O(\frac{1}{2}, \frac{b^2 - a(1-a)}{2b})$.

The coordinate of the circumcenter may not look good, but we can verify that the slope of the three points are equal to $\frac{-3a^2+3a-b^2}{(2a-1)b}$ so that they are collinear. Are there any better coordination? Please don't hesitate to tell me.

Food for thoughts:
1. (PCIMC 2014 Heat SS Q12) In triangle ABC, D is the mid-point of BC. E and F are on AB, AC respectively so that BC//EF. If $S_{AEF}:S_{DEF} = 7:3$ (area ratio) and $S_{ABC} = 100$, find $S_{DFC}$.

2. (PCIMC 2013 Final SS Q14) For a kite ABCD, $AB = AD = 50$, $CB = CD = 75$. E,F,G,H are on AB, BC, CD, DA respectively so that EFGH is a square. Moreover let angle $ABC, ADC$ be right angle and EF//AC. Find the area of the square.

Friday 28 February 2014

Speech of the day

Composing is like setting up equations in a linear space and what composer do is to bring you to a higher space, allowing more equations and possibilities.

This will be my new quote in the about page, as this fantastic statement perfectly shows some of my interests.

Any thoughts?