Wednesday 25 September 2013

Thoughts on minesweeper

Minesweeper has been a classic for those who had been using the Window OS, and those who born in the MSN era will not forget the minesweeper battle (and I wrote a few entries about minesweeping in the past, in Chinese)

For those who don't know what is minesweeper, I seriously recommend you to have a go for it. We can define the minesweeper formally:

Initial status: a rectangular matrix with all grids closed.
State of grids:
1) Closed.
2) Opened: showing a number, which equals to the number of mines in the 8 neighboring grids (5, or 3 if grid is on the boundary).
3) Checked: the grid is marked to contain a mine. (No grids will contain more than one mine) Also, assume wrongly opened grid to be checked.
Objective: Turn all closed grids into either opened or checked CORRECTLY.

Minesweeper sometimes depends on luck, but it depends on logical deduction as well. I won't go deep into the logical stuffs here but here we analysis the principle behind logical deduction.

Just as what we learn from computation complexity, every NP-complete question is as hard as the 3-SAT (satisfibility) question. The minesweeper game is not necessarily a NP-complete question but we can still turn them into SAT question.

For the following question, we use white square or bracketed numbers for closed grids, black square for checked grids, and numbers for opened grids. Assume the boundary of the given grid is the true boundary of the board unless otherwise specified.

In our example, any bracketed number $(n)$ is equivalent to the boolean variable $a_n$, which is true if it has a mine, or zero otherwise. For SAT questions, clauses are "OR" inside and "AND" outside, but we can but too lazy to turn them into a standard SAT format.

Example 1. $\begin{matrix} (1) & (2) \\ 1&1\end{matrix}$

The information given from the two '1' s are equivalent, and it this $a_1 \bigoplus a_2$.

Example 2. $\begin{matrix} (1) & (2) & (3) & (4) \\ \blacksquare & 2 & 3&\blacksquare \end{matrix}$

How do we express "1 out of 3" with binary boolean operator? The solution is "any two out of three has no more than one" AND "it has at least one". To express "2 out of 3", we can use the opposite of "1 out of 3"

The information is: $C_1 \wedge C_2$ where

$C_1 = \neg (a_1 \wedge a_2) \wedge \neg (a_2 \wedge a_3) \wedge \neg (a_1 \wedge a_3) \wedge (a_1 \vee a_2 \vee a_3)$

$C_2 = (a_2 \vee a_3) \wedge (a_3 \vee a_4) \wedge (a_2 \vee a_4) \wedge \neg (a_2 \wedge a_3 \wedge a_4)$

By some common sense, or SAT solvers, we will eventually get $a_1$ being false (no mine) and $a_4$ being true. No solutions for $a_2, a_3$? No worries. As we flip grid 1 and check grid 4, we get definite answers as we opened those grids. It's now reduced to example 1 but with extra clauses that we can solve it. (we CAN because the new clause is a single variable!)

Technically, a minesweeper board with $n$ grid is a SAT question $C_1 \wedge ... \wedge C_n$, where each clause represent the information from a single grid. Of course, if a grid contains a mine, then it gives no information but also automatically satisfied, then the clause is empty.

Of course, we won't have all the clauses on our hands, but let's not forget the fact that clauses are made up based on a fixed board, and hence the clauses can be satisfied. Since the clauses are solvable, if a variable can be solved from the partial clauses then it also satisfies the overall clauses, which enable us to reveal more information and eventually solve the whole board. Instead of determining solvability, our question is: are there MULTIPLE solutions towards the SAT question?

Definitely there are setups with multiple solutions. One stupid example:

Example 3. $\begin{matrix} \blacksquare &\blacksquare &\blacksquare \\ \blacksquare & (1) & \blacksquare \\ \blacksquare & \blacksquare & \blacksquare\end{matrix}$

Information: $()$. This is damn independent of $a_1$ !!!!

Example 4. $\begin{matrix} \blacksquare &2&2&\blacksquare \\ 2&(1)&(2)&2\\2&(3)&(4)&2\\\blacksquare &2&2&\blacksquare\end{matrix}$

The information is $(a_1 \bigoplus a_2) \wedge (a_2 \bigoplus a_4) \wedge (a_3 \bigoplus a_4) \wedge (a_1 \bigoplus a_3)$. After some simplification the answer is $(a_1 \leftrightarrow a_4) \bigoplus (a_2 \leftrightarrow a_3)$, which clearly brings you two solutions!

Our question is, how often that a board will give us situations that can't be deterministically solved? We have to formalize this question though:

Can we ask that if all clauses are provided? No as each empty clause represents a grid to be checked. Instead, we state the following algorithm:

def mssolver(grid, initpos = null):
if initpos not null start from the assigned position, otherwise start on a random grid
while not all grids opened/checked:
    Construct clauses from given information
    Solve the clauses and open/check ensured grids
    if none of the grids are opened in this loop, reveal a grid randomly
return grid

I seldom leave open questions here, but SAT solvers are beyond my scope and I have other topics to deal with, but here are the questions:

Assuming that for a board, each grid has a probability $p$ to contain a mine.

Question 1. Given an infinite board (so that the initial position does not matter), how often do we have to open a grid blindly? (we can count the number of loops taken, which divides the number of times that we open a random grid.) Moreover, how often will we open a grid wrongly?

Question 2. What if we work on finite rectangular grid? Does the initial position matters?

Question 3. What if the minesweeper structure is changed? In particular, in the hexagon board?

It will be nice if there are some simulation for this one. I think that the probability of getting wrong is less than that of a finite grid because we can go further and as we go further we have possible more information so that the chance that all information is used up is smaller.

Are there any further optimization? Surely we can do some probabilistic analysis? Not sure if I have the time to deal with it, but I will try my best. Before I end this entry I'll simply throw an example that demonstrates how probabilistic analysis works. Ask yourself why about that! ;)

Example 5. $ \begin{matrix} \square & 4& \square & \blacksquare \\ \square &\square &\square & \blacksquare \\ \blacksquare  & 7(!!) &\square &3\\ \blacksquare & \blacksquare &\square & \square \end{matrix}$