Thursday, 26 December 2013

Christmas songs



I joined a few musical events before the end of 2013 which include the nutcracker performance (for those who follow don't shoot the pianist, you will know that Nov and Dec are always the time for such performance lol) but I also spent some time on reviewing old and new Christmas songs. By chance I found the game BGM Christmas in the 13th month which is a wonderful piece after all. Now I'm enjoying a succeeder of the song, dancing Christmas in the 13th month's remix from sync. art (I guess it's well known for their touhou remixes) so I hope that you will enjoy them too. Well the above one isn't from sync. art but the best performance among all clips in my opinion that the smooth sound from the viola greatly enhance the dancing that mixes well with the lively going background.

Merry Christmas.

Thursday, 19 December 2013

A rare diary

It's long since my last diary written here but this is an important one. As some sort of researcher in the field of computational complexity, I ought to dig the SIMC problem out and make an in depth analysis on it, in particular the complexity classification, and algorithm solving them.

Allow me to brief the problem here.

In the graph, assume every edge has weight 1 that the distance can be calculated accordingly. We also define the distance between edges and vertices. For $e = (u,v) \in E$, $d(e,w) = 0.5 + min(d(u,w), d(v,w))$ and the distance between edges immediately follows. It is clear that the distance measurement is metric. (why?)

For a graph $G = (V,E)$ with two disjoint sets of vertices $V_s, V_c$, define score to be the number of edges closer to $V_c$. i.e. $d(e, V_c) \leq d(e, V_s)$. If the distances are equal it counts as 0.5.

Let's have a function to describe the score: $f(G, V_s, V_c) =~score$.

Instance: $G = (V,E)$ and two disjoint sets of vertices $V_s, V_c \subseteq V$, with some further constraint specified upon the challenge.
Output: Another set $V_{c'}\subseteq V$ that is mutually disjoint with $V_s, V_c$.
Goal: Maximize $f(G,V_s, V_c \cup V_{c'})$.

Maybe you have already heard it before, that Greedy algorithm works really well for this problem. Is it the best solution? It is clear that the greedy algorithm is a polynomial time algorithm. Then is this problem in P? Is it NP? This will be an exciting review of the old problem.

Sunday, 15 December 2013

GCD and exponents

LCM (least common multiple) and GCD (greatest common divisor) (for integers as a integral domain) is a relatively easy field that indicates similar behavior for some numbers. In the previous entry I have constructed some related identities using prime factorization, and here I would like to introduce a few related problems that we use the properties of GCD and LCM to solve it.

Definition. For natural numbers $a_1, ..., a_n$, let $[a_1,...,a_n]$ be their LCM and $(a_1,...,a_n)$ be their GCD.

Theorem. For $m,n\in \mathbb{N}$, $mn = [m,n](m,n)$.

Q1. (Korea MO 2013, Day2 Q1) Find all $f: \mathbb{N}\rightarrow \mathbb{N}$ so that $f(mn) = [m,n](f(m), f(n))$.

When you see the above theorem a very natural solution is that $f(m) = m$ but we are asked to find all of them. Is this possible?

Substitute $(m,n) = (1,k)$, we have $f(k) = k(f(1),f(k)) = [1,k](f(1),k(f(1),f(k)))$. Let $(f(1),f(k)) = m$ then $f(k) = k(f(1),km) = km$. It is clear that $m \mid f(1)$ so that $f(k) = k(f(1),km) = kmp$ where $p\in \mathbb{N}$. Clearly, $p=1$ so that $(f(1), k) = 1$. Since $k\in \mathbb{N}$ is arbitrary, we have $f(1) = 1$, and $f(k) = k(f(1),km) = k$.

Here we should notice a fact:

Fact 1. $[a_1,...,a_n] \geq max(a_1,...,a_n)$ and $(a_1,..,a_n) \leq min (a_1,...,a_n)$

GCD is something that decrease when you repetitively take it, and if you can take it arbitrarily that gives the same result, the number is probably 1.

In this question, we put the expansion of $f(k)$ to get a looped GCD brackets, but it can be taken alternatively:

Fact 2. For $m,n\in \mathbb{N}$, $(m,n) = (m,(m,n))$.

The next question is about GCD identity on more variables.

Q2. Find all $(a,b,c) \in \mathbb{N} ^3$ so that $[a,b,c](a,b,c) = abc$

We use the proven identities from the last entry:

Theorem. For $a,b,c\in \mathbb{N}$, $[a,b,c]^2 \prod _{cyc} (a,b) = (a,b,c)^2 \prod _{cyc} [a,b]$.

Now $[a,b,c]^2(a,b,c)^2 = (a,b,c)^4 \prod _{cyc} [a,b]/(a,b) = (a,b,c)^4 \prod _{cyc} (ab/(a,b)^2) = (abc)^2 (a,b,c)^4/ \prod _{cyc} (a,b)^2 $ and we have

$[a,b,c](a,b,c) = abc (a,b,c)^2/ \prod_{cyc} (a,b)$

To make the last term 1 it is clear that all numbers must be relatively prime. To see why simply note that $(a,b)^2 \geq (a,b) \geq (a,b,c)$ and equality holds iff $(a,b) = 1$ so that we need $(a,b) = (b,c) = (c,a) = 1$, then $(a,b,c) = 1$.

Clearly all relatively prime triplets satisfies the question.

Sometimes the GCD and LCM notation takes all prime factors into account, but what if we want to consider a particular prime? We can prove the case for a certain prime and by enumerate through the primes we get the desired result as every natural numbers (which is finite) can be factorized into (finite) primes.

Definition. (Highest exponent) For $n\in \mathbb{N}$ and prime $p\in \mathbb{P}$ denote $v_p(n) = k$ if $p^k \mid n$ but $p^{k+1}\nmid n$. If $p\nmid n$ then $v_p(n) = 0$. (In some texts we don't allow 0 but here we just let it be)

Then we can prove GCD results in terms of prime exponents, and the proof is naturally completed by the FTOArithmetic.

Example 1. $v_p([a,b]) = max(v_p(a), v_p(b))$ and $v_p((a,b)) = min (v_p(a), v_p(b))$.

Example 2. Since $max(a,b) + min(a,b) = a+b$, we have $v_p([a,b](a,b)) = v_p([a,b])+v_p((a,b)) = a+b$, which proves the identity $(a,b)[a,b] = ab$.

Example 3. When there are more variables and the equation of interest is cyclic we can WLOG assume the order, here let $v_p(a) =x$, $v_p(b) = y$ and $v_p(c) = z$ and WLOG let $x\geq y\geq z$. Then

$v_p([a,b,c]^2\prod (a,b)) = v_p((a,b,c)^2 \prod [a,b]) = 2x+y+2z$

Here we have an alternate prove for Q2, from mathlinks, using the exponent notation:

Let $v_p(a) =x$, $v_p(b) = y$ and $v_p(c) = z$ and WLOG let $x\geq y\geq z$. The question is equivalent to $x+z = x+y+z$ so that $y = 0 \geq z = 0$, so that $(a,b) = (b,c) = (c,a) = 1$ due to cyclic assumption. Then $(a,b,c) = 1$ and $[a,b,c] = abc$.

Q3. Find all triplets $(a,b,c) \in \mathbb{N}^3$ so that $a+b = (a,b)^2$ in cyclic manner.

We always get trouble when addition steps into questions on GCD. To do this notice that for prime $p\geq 3$, it is impossible that $v_p(a+b) \geq v_p ((a,b))+1$ for all $(a,b) \leftarrow (a,b), (b,c),(c,a)$ (why?) Then we have $(a,b)$ equals to 1 or 4, which allow us to enumerate all cases.

Using the exponent notation we can have a very powerful theorem called the lifting the exponent lemma, that allows us to use the exponent notation to find the highest power of some complicated forms. These can be easily searched so I will not cover it here. It is actually frequently used in current Olympic tournaments!

Three little problems:

1) (Russian MO 1999) Find all bounded sequence $\left\{ a_n \right\} \subseteq \mathbb{N}$ where for all $n\geq 3$, $n\in \mathbb{N}$, $a_n = (a_{n-1}+a_{n-2})/(a_{n-1},a_{n-2})$.

2) (A classical lemma) For $a,m,n\in \mathbb{N}$ and $a\geq 2$ show that $(a^m -1, a^n - 1) = a^{(m,n)}-1$.

3) Describe all the triples $(a,b,c)\in \mathbb{N}^3$ so that $[a,b,c](a,b,c) = \sqrt{abc}$.

Sunday, 24 November 2013

A little analysis problem

Recently I've been reading Springer GTM on hyperreals, but I'm not good enough to take any of these into the discussions here. I'm going to discuss a nice little problem here but due to the competitive nature of this question I am not going to disclose the source of it (maybe till the competition being due).

Question. Suppose $\left\{ a_n\right\} \subseteq \mathbb{N}$ strictly increasing. Show that $\sum_{i=1}^{\infty} \frac{a_{i+1}-a_i}{a_i}$ diverges.

A geometric interpretation gives you the conclusion immediately:

Solution. $\sum_{i=1}^{\infty} \frac{a_{i+1}-a_i}{a_i} \geq \sum_{i=1}^{\infty} \sum_{j=a_i}^{a_{i+1}-1} j^{-1} = \sum_{i = a_1}^{\infty} i^{-1}$

That clearly converges as indicated by the comparison test since the last one is a partial harmonic series. Please note that the above expression should be avoided as the $\geq $ sign is vague due to the divergence nature of the series.

Now...

Question 2. Suppose $\left\{ a_n\right\} \subseteq \mathbb{R}^+$ strictly increasing. Will $\sum_{i=1}^{\infty} \frac{a_{i+1}-a_i}{a_i}$ converge?

Extending the above proof to $\mathbb{R}$ seems to be a nice idea. To do this we shall compare the term $\frac{a_{i+1}-a_i}{a_i}$ with the integral $\int ^{a_{i+1}}_{a_i} x^{-1}dx$. We can show that they are usually asymptotically equivalent. Why usually? Let's bound them as the following:

The lower bound is clear (we use that for divergence). Suppose the sequence is unbounded then

$\sum_{i=1}^{\infty} \frac{a_{i+1}-a_i}{a_i} \geq \sum_{i=1}^{\infty} \int ^{a_{i+1}}_{a_i} x^{-1}dx = \int ^{\infty} _{a_1} x^{-1} dx$

Of course if it is bounded then it converges to $k\in \mathbb{R}$ then the series is lowerly bounded by $\int^k_{a_i} x^{-1} dx = \ln (k) - \ln (a_1)$.

What bothers us is the upper bound. Since $\left\{ a_n\right\}$ strictly increasing we use the convention that $k$ being limit of the sequence $\left\{ a_n\right\}$, which is infinity if it diverges.

Define $N_i = \frac{a_{i+1}}{a_i}$. Then if $\left\{ N_i\right\}$ upperly bounded by $N$ then the series is also upperly bounded by $\int ^{ k}_{a_1} Nx^{-1}dx = N(\ln k - \ln 1)$, so if $N_i$ and $a_i$ bounded then the series is bounded as well.

Is it possible that $a_i$ bounded but $N$ being unbounded? The answer is clear: no. Suppose $a_i$ upperly bounded by $A$. Then $N_i = \frac{a_{i+1}}{a_i} \leq \frac{A}{a_1}$ so that it is also upperly bounded.

Solution. The series converges if and only if the sequence $\left\{ a_n\right\}$ converges. i.e. iff the sequence is bounded above.

Define $f(x): \mathbb{R}^+ \rightarrow \mathbb{R}^+$ that $f(x) = \sum_{i: a_i \leq x} \frac{a_{i+1}-a_i}{a_i}$. Then it would be nice to do some asymptotic analysis over the function $f$ given that $a_i$ diverges.

From our previous analysis, if $N_i$ bounded then $f(x) \sim \log x$. What if $N$ unbounded? If we think about how $N_i$ is defined we can make up this example: $a_n = n!$ then $\frac{a_{i+1}-a_i}{a_i} = n$ .

Then it is clear that $N_i$ unbounded as $N_i = i+1$ with $f(n!)\sim n^2$, but if $N_i$ bounded we have $f(n!) \sim \log (n!) \sim n\log n$ by Stirling's formula, which has a different growth rate.

Foods for thought:

Given strictly increasing sequence $\left\{ a_n\right\}$ where $a_n \neq 0 \forall n\in \mathbb{N}$, define $b_i = \frac{a_{i+1}-a_i}{a_i}$.

1) Given a sequence $\left\{ a_n \right\} \in \mathbb{R}/\left\{ 0\right\}$. Give the condition that the series $\sum b_i$ converges.

2) Repeat Q1 if $\left\{ a_i\right\}\subseteq \mathbb{R}^+$ is increasing

3) For $k\in \mathbb{N}$, Define $b_{i,k} = \frac{a_{i+k} - a_i}{a_i}$ where $\left\{ a_i\right\}\subseteq \mathbb{R}^+$ strictly increasing. Determine the condition of convergence for $\sum_i b_{i,k}$.

4) Back to the original question. How fast can $f(x)$ grow? Are there any asymptotical boundings?

Sunday, 6 October 2013

An algorithm to weight influences

This is my first few times using MathJax! If you have any troubles in viewing these TeX formula don't hesitate to tell me.

Introduction

There are many types of contests that require subjective judging like diving as well as osu! mapping contests. We are human that it is unavoidable that bias may occur even though there are well-written guidelines to follow. The aim of this entry is to make an attempt on correcting such bias by an algorithm, and to determine the degree of bias occurred in the judgement.


Wednesday, 25 September 2013

Thoughts on minesweeper

Minesweeper has been a classic for those who had been using the Window OS, and those who born in the MSN era will not forget the minesweeper battle (and I wrote a few entries about minesweeping in the past, in Chinese)

For those who don't know what is minesweeper, I seriously recommend you to have a go for it. We can define the minesweeper formally:

Initial status: a rectangular matrix with all grids closed.
State of grids:
1) Closed.
2) Opened: showing a number, which equals to the number of mines in the 8 neighboring grids (5, or 3 if grid is on the boundary).
3) Checked: the grid is marked to contain a mine. (No grids will contain more than one mine) Also, assume wrongly opened grid to be checked.
Objective: Turn all closed grids into either opened or checked CORRECTLY.

Minesweeper sometimes depends on luck, but it depends on logical deduction as well. I won't go deep into the logical stuffs here but here we analysis the principle behind logical deduction.

Just as what we learn from computation complexity, every NP-complete question is as hard as the 3-SAT (satisfibility) question. The minesweeper game is not necessarily a NP-complete question but we can still turn them into SAT question.

For the following question, we use white square or bracketed numbers for closed grids, black square for checked grids, and numbers for opened grids. Assume the boundary of the given grid is the true boundary of the board unless otherwise specified.

In our example, any bracketed number $(n)$ is equivalent to the boolean variable $a_n$, which is true if it has a mine, or zero otherwise. For SAT questions, clauses are "OR" inside and "AND" outside, but we can but too lazy to turn them into a standard SAT format.

Example 1. $\begin{matrix} (1) & (2) \\ 1&1\end{matrix}$

The information given from the two '1' s are equivalent, and it this $a_1 \bigoplus a_2$.

Example 2. $\begin{matrix} (1) & (2) & (3) & (4) \\ \blacksquare & 2 & 3&\blacksquare \end{matrix}$

How do we express "1 out of 3" with binary boolean operator? The solution is "any two out of three has no more than one" AND "it has at least one". To express "2 out of 3", we can use the opposite of "1 out of 3"

The information is: $C_1 \wedge C_2$ where

$C_1 = \neg (a_1 \wedge a_2) \wedge \neg (a_2 \wedge a_3) \wedge \neg (a_1 \wedge a_3) \wedge (a_1 \vee a_2 \vee a_3)$

$C_2 = (a_2 \vee a_3) \wedge (a_3 \vee a_4) \wedge (a_2 \vee a_4) \wedge \neg (a_2 \wedge a_3 \wedge a_4)$

By some common sense, or SAT solvers, we will eventually get $a_1$ being false (no mine) and $a_4$ being true. No solutions for $a_2, a_3$? No worries. As we flip grid 1 and check grid 4, we get definite answers as we opened those grids. It's now reduced to example 1 but with extra clauses that we can solve it. (we CAN because the new clause is a single variable!)

Technically, a minesweeper board with $n$ grid is a SAT question $C_1 \wedge ... \wedge C_n$, where each clause represent the information from a single grid. Of course, if a grid contains a mine, then it gives no information but also automatically satisfied, then the clause is empty.

Of course, we won't have all the clauses on our hands, but let's not forget the fact that clauses are made up based on a fixed board, and hence the clauses can be satisfied. Since the clauses are solvable, if a variable can be solved from the partial clauses then it also satisfies the overall clauses, which enable us to reveal more information and eventually solve the whole board. Instead of determining solvability, our question is: are there MULTIPLE solutions towards the SAT question?

Definitely there are setups with multiple solutions. One stupid example:

Example 3. $\begin{matrix} \blacksquare &\blacksquare &\blacksquare \\ \blacksquare & (1) & \blacksquare \\ \blacksquare & \blacksquare & \blacksquare\end{matrix}$

Information: $()$. This is damn independent of $a_1$ !!!!

Example 4. $\begin{matrix} \blacksquare &2&2&\blacksquare \\ 2&(1)&(2)&2\\2&(3)&(4)&2\\\blacksquare &2&2&\blacksquare\end{matrix}$

The information is $(a_1 \bigoplus a_2) \wedge (a_2 \bigoplus a_4) \wedge (a_3 \bigoplus a_4) \wedge (a_1 \bigoplus a_3)$. After some simplification the answer is $(a_1 \leftrightarrow a_4) \bigoplus (a_2 \leftrightarrow a_3)$, which clearly brings you two solutions!

Our question is, how often that a board will give us situations that can't be deterministically solved? We have to formalize this question though:

Can we ask that if all clauses are provided? No as each empty clause represents a grid to be checked. Instead, we state the following algorithm:

def mssolver(grid, initpos = null):
if initpos not null start from the assigned position, otherwise start on a random grid
while not all grids opened/checked:
    Construct clauses from given information
    Solve the clauses and open/check ensured grids
    if none of the grids are opened in this loop, reveal a grid randomly
return grid

I seldom leave open questions here, but SAT solvers are beyond my scope and I have other topics to deal with, but here are the questions:

Assuming that for a board, each grid has a probability $p$ to contain a mine.

Question 1. Given an infinite board (so that the initial position does not matter), how often do we have to open a grid blindly? (we can count the number of loops taken, which divides the number of times that we open a random grid.) Moreover, how often will we open a grid wrongly?

Question 2. What if we work on finite rectangular grid? Does the initial position matters?

Question 3. What if the minesweeper structure is changed? In particular, in the hexagon board?

It will be nice if there are some simulation for this one. I think that the probability of getting wrong is less than that of a finite grid because we can go further and as we go further we have possible more information so that the chance that all information is used up is smaller.

Are there any further optimization? Surely we can do some probabilistic analysis? Not sure if I have the time to deal with it, but I will try my best. Before I end this entry I'll simply throw an example that demonstrates how probabilistic analysis works. Ask yourself why about that! ;)

Example 5. $ \begin{matrix} \square & 4& \square & \blacksquare \\ \square &\square &\square & \blacksquare \\ \blacksquare  & 7(!!) &\square &3\\ \blacksquare & \blacksquare &\square & \square \end{matrix}$

Thursday, 8 August 2013

Note transfer completed

So, here we go.

Notes are transferred to http://forretrio.t15.org/ . New instructions has been added to the heading of the site.

The junior section are referred to be local post within this blog. Some of them looks messy, and upon request I will repair them, while the senior section are now downloadable.

If there are any further questions comment here. Otherwise, enjoy.

Wednesday, 10 July 2013

Concerning the notes

Currently my ftp site got demolished without any notice --- that's quite hash for me, and my site.

I'm going to find a new host for them as soon as possible, and I hope this can be done within the entire month. This won't be an easy task --- I have no backup in my current computer and I'm afraid that some of the junior form subject notes has been lost --- I will find them with my best. If they are not found I'll try to compensate it through other means.

FTP hosts are getting less common these days. We have experienced the evolution process of online social platform --- MSN community (late 90s) --- geocities (90s - early 10s) --- forums (00s) --- blogs (00s) --- and facebook. As written in the book Zero, the future of a radical price professionalism declines on the web --- and they evolves into more integrated platform. Individual sites lost its commercial potential when people use facebook --- and now no one supplies free FTP host. This will be a long and sad story but let's keep this piece of entry be an announcement rather than complaints.

I planned to make a full site connecting to this blog with the notes, but that would absolutely be a great challenge for me, and it takes quite a lot of time.

What should you do if you want to retrieve the notes?

1) Take the web version from this blog.
2) Serior form notes are hosted here temperorily: http://forretrio.comli.com/ I'm sorry for the stupid advertisment. I'll find a new host ASAP, as mentioned.

Thanks for supporting my blog, and if you have any enquiries put it in the ask.fm (on the right hand side), or comment anywhere in my site. To make this entry 'sticky' I'll not write any new stuff until the problem is solved.

11/7/2013

Wednesday, 3 April 2013

Friendwheel

Errrrrr three months from my last entry...

Anyway if you haven't heard of friendwheel, a facebook apps that generates the graph of friendship links of all your friends:


(I blurred the names of my friends for privacy. These blurred pixel looks awesome right?)

In general, they put similar friends closer (this is how the wheel works --- grouping in similar colour)

However as you can see some of the points are emitting a lot of linkage far from the neighbouring group. It turns out that their alogarithm is not very effective.

Definition. Distance between two points are defined by (number of points between them)+1

Definition. Circular n-tuple (a_1,...,a_n) is n numbers in order, and a_1 is supposed to be the next term of a_n. Due to the cyclic structure, it is equivalent to (a_2,a_3,...a_n, a_1) and so on.

Definition. Given a graph G = (V,E), the optimal permutation is a circular |V|-tuple where the tuple is the permutation of vertices so that sum of distance between any two linked vertices is the minimal.

Definition Given a graph G = (V,E). Optimal distance is the sum of distance between any two linked vertices.

The optimal permutation is not necessarily unique. For example consider:

G = (V,E); V = {1,2,3,4}, E = {(1,2),(3,4)}, then both (1,2,3,4) and (1,2,4,3) are optimal permutations with optimal distacnce 2.

Now the problem is, how should we reduce the sum of distance given G and an arbitrary permutation of V? I'll propose a heuristic here:

begin;
local_min_reached = FALSE
while not local_min_reached:
---swapping = FALSE
---for i in 1 to n:
------check if swapping a_i and a_{i+1} reduces sum of distance
------if yes then swap the two vertices
------swapping = TRUE
---if swapping = FALSE:
------local_min_reached = TRUE
return permutation of edges.

I'm so lazy to calculate its preformance ratio as well as other stuffs since it would take me quite a lot of time to do the simulation (like my research topic), but I'm sure that this one is qutie near to the optimal solution. I have no idea to construct G so that the procedure stuck at local optimum and fail to reach the global optimum with the above alogarithm.

Any idea?

Wednesday, 23 January 2013

Transformation

Happy new year.

Just a random passage appreciating one of the common techniques...