Saturday 19 June 2021

FEH VG predictor continued: wave pattern and early estimation

Building a model for VG is something that I wanted to do for a long time. In the previous article I wrote about the basics of a VG model and the article concluded with the chart below:


The perfect wave in the first 12 hours caught my eye -- is it a coincidence or a general phenomenon? The aim of this article is to look for further patterns that help us build the model. Before we start, recall the terms that I used in the previous article; please refer to it for further details.

- Three examples, all extracted from VG June 2021. You can extract the raw data from the Japanese predictor made by @rammtiger_n.

Example 1: Final (Popularity ratio >4)
Example 2: Quarterfinal 4 (Popularity ratio 1~2)
Example 3: Semifinal 2 (Popularity ratio close to 1)

- Parameter $k$: the parameter such that the accumulated score is of order $O(t^{2+k})$, or equivalently that the team's activity is of order $O(t^{1+k})$. To be more precise, for team $i$ ($i = 1,2$) define $c_i(t)$ to be the constant factor, which scales with team size and switches between three values according to the state of the hour, and $f_i(t)$ to be the corresponding hour multiplier (which is either $1.05+0.05t$ or $3.2+0.2t$). Ignoring intraday variation, we assume that the team activity $A_i(t)$ is approximated by $c_i(t) f_i(t)^{1+k}$.

One should note that this parameter is not necessarily the same for the two teams, but the two values are close enough most of the time. Let us first assume that $k$ is uniform across the two teams.

*

The chart shown at the beginning is what happened in example 3. The curves are easily spotted because it is a perfect ping-pong where the activities of the two teams are almost equal. At the same time, when one team is in the excited state the other must be in the post-excitement state, as it is exhausted from the bonus multiplier in the previous hour. As a result we find two perfect curves with alternating dots, one for the activity in the excited state and another for the activity in the post-excitement state.

We do not have a perfect ping-pong most of the time, so is there any way to extract such a trend if it exists? One approach is to assign a factor to each of the three states: we may assume that the normalized activity in the excited state is 10 times the normal activity and 100 times the post-excitement activity. Although we can justify this by the fact that flags come with a multiplier of up to 100, such a ratio is still affected by the parameter $k$, which we do not want to fix.

There is a smarter way around this: observe that the states of the two teams are almost always excited + post-excitement or normal + normal. On rare occasions they could be normal + excited or normal + post-excitement, but these cancel out. Therefore we can simply take the (geometric) average of the (normalized) activities to retrieve the trend!

Mathematically, we first guess the parameter to be $k_0$. We then normalize the activity by considering $A_i(t)/f_i(t)^{1+k_0} = c_i(t) f_i(t)^{k-k_0}$. Taking the geometric mean over the two teams gives
$GM = (c_1(t)c_2(t))^{1/2}(f_1(t)f_2(t))^{(k-k_0)/2}$.
If we are in either the excited + post-excitement or the normal + normal states, then $\sqrt{c_1(t)c_2(t)} = c_S$ is a constant. Since $f_1(t)f_2(t)$ is always $\Theta (t^2)$, the geometric mean is constant (or regresses to a constant) if and only if $k=k_0$, i.e., if the estimated parameter $k_0$ matches the true parameter. We take $\log GM$ instead of $GM$ to even out the impact of normal + excited states against normal + post-excitement states.
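As a rough illustration of the normalization, here is a minimal Python sketch of the log-geometric mean for a single hour. The function names and the boolean bonus flags are hypothetical, and the hourly activities are assumed to be already extracted from the raw data as positive numbers.

```python
import math

def hour_multiplier(t, bonus):
    """Hour multiplier f(t): 3.2 + 0.2t under bonus, 1.05 + 0.05t otherwise."""
    return 3.2 + 0.2 * t if bonus else 1.05 + 0.05 * t

def log_gm(t, activity_1, activity_2, bonus_1, bonus_2, k0):
    """Log of the geometric mean of the two normalized activities at hour t.

    Each activity A_i(t) is divided by f_i(t)^(1 + k0), where k0 is the
    guessed parameter. Activities must be positive.
    """
    f1 = hour_multiplier(t, bonus_1)
    f2 = hour_multiplier(t, bonus_2)
    n1 = activity_1 / f1 ** (1 + k0)
    n2 = activity_2 / f2 ** (1 + k0)
    return 0.5 * (math.log(n1) + math.log(n2))
```

If the activities really follow $c_i f_i^{1+k}$ and the guess is right, the value returned is $\log \sqrt{c_1 c_2}$ regardless of the hour, which is the flatness that the regression looks for.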

As a demonstration, we calculate the log-geometric mean of team activity for example 1 and get the following chart (with the guess of $k=1$):


We can see a downward trend starting from hour number 8, indicating that $k=1$ is an overestimate here. 

Again we retrieve the same early wavy pattern as in the first chart. It has a simple explanation: in FEH there are quests to clear, and you need to clear these simple quests to get the (maximum number of) flags. The quests are mostly "clear VG with a red/blue/green/colorless unit", but they require you to actually enter VG. On the other hand, you start the event with zero votes, so you cannot do these quests right away. Most people do these quests with votes almost fully restored, which is exactly 4-8 hours into the event.

Now we can estimate $k$ by removing the first 4 hours as outliers and searching for the $k_0$ at which the linear regression returns a zero slope. Since the regressed slope is strictly decreasing in $k_0$, we can always find such a $k_0$.
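The search can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the bracket $[0, 3]$ for $k_0$ and the tolerance are all hypothetical, and the inputs are lists of hourly activities `a1`, `a2` and hour multipliers `f1`, `f2` starting from hour 1.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def log_gm_slope(hours, a1, a2, f1, f2, k0):
    """Regressed slope of log GM against the hour number for a guess k0."""
    log_gm = [
        0.5 * (math.log(x / p ** (1 + k0)) + math.log(y / q ** (1 + k0)))
        for x, y, p, q in zip(a1, a2, f1, f2)
    ]
    return ols_slope(hours, log_gm)

def estimate_k(hours, a1, a2, f1, f2, skip=4, lo=0.0, hi=3.0, tol=1e-6):
    """Bisect for the k0 at which the log-GM regression slope is zero."""
    # Drop the first `skip` hours as outliers (assumes the lists start at hour 1).
    data = [series[skip:] for series in (hours, a1, a2, f1, f2)]
    if log_gm_slope(*data, lo) <= 0 or log_gm_slope(*data, hi) >= 0:
        raise ValueError("no sign change of the slope on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_gm_slope(*data, mid) > 0:
            lo = mid  # slope still positive: the guess is too small
        else:
            hi = mid  # slope negative: the guess is an overestimate
    return 0.5 * (lo + hi)
```

Because $\log GM$ is linear in $k_0$, the regressed slope is linear in $k_0$ as well, so a closed-form solution would also work; bisection is used here only because it mirrors the search described above.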

If we apply that on example 1 we estimate $k$ to be 0.85: 


And if we apply that on example 2 we estimate $k$ to be 1.17:


The wavy pattern seems to be very consistent across all situations: we always observe two peaks, one at hour number 4 (which corresponds to 8 hours into the event since we removed the first four) and another at hour number 12 (16 hours into the event). We may interpret these as the activity peaks from players in different parts of the world. Computationally, the peaks and troughs help us greatly in the sense that we can do the same linear regression using only the first two peaks and troughs, i.e., the data of the first 20 hours, and the result is highly correlated with the estimate using all 44 hours of data.


Example 1: $k$ estimated to be 0.8 with the data of hour number 5~20 vs 0.8 on global data


Example 2: $k$ estimated to be 0.92 with the data of hour number 5~20 vs 0.92 on global data

It seems that such an estimate is always an underestimate due to the (out-of-correlation) increased activity at the far end, but we can always add a little bit to our estimate.

*

So, what can we do with the predictor now? This is a proposed way of creating a prediction:

- Use the early data to estimate the constant factor for teams' activity with $k_0=1$
- Predict by combining team activity and states guessing
- Analyze team composition by wave decomposition at hour number 20 and modify $c_i(t)$ accordingly
- Update $k_0$ by linear regression every time before iterating through the prediction after hour number 12 or 20
- ???
- Feathers!

As much as the above is a big and serious discussion, I still prefer participating in the event in a simple way, by guessing the frequency of the bonus hours linearly. A 99% accurate predictor? Sure, but no thanks if I am the one to write the code. Not to mention that it is actually quite hard to measure the error in a dynamic system, and we just can't tell in a mathematically rigorous way how accurate our predictor could be...


The charts were not properly imported onto Google Drive, but you can plot them easily. Columns L-N are the time-normalized difference and bonus boundaries with $k=1$. Column T is the log-geometric mean of team activities. You can change $k$ as you like at W4 and W5, but the $k$ values for the two teams are equal by default. The three labels are SF2, F and QF4, which correspond to examples 3, 1 and 2 respectively.

Friday 11 June 2021

Some optimization on the FEH VG predictors

Voting Gauntlets in FEH are always controversial in many ways. In terms of outcome, some hate to see popular characters always triumph over ordinary characters, while the rest complain that the result is unpredictable and favors the chasing side by so much. In terms of rewards, some players are unhappy about the lack of them -- well, actually 12 orbs is a lot, and the feathers are also very friendly to new players (and even to me back in the day). In terms of difficulty, some find that playing with 3 random characters is quite fun, but some say they have terrible luck and that facing 3 fallen Edgelords is bullshit.

But today I want to talk about the mathematics and not the game mode itself. How can we predict the outcome given the first few/12/24 hours of data? Certainly there are a few attempts already: on Reddit there are a few predictors from the West, and there is also one from Japan. I found that the interface of the Japanese predictor is pretty nice, although its prediction is sometimes off.

In the past I have talked about VG in the sense of a multiplayer game -- in the game theory sense, but this article is doing the complete opposite. We assume that the reaction is fixed under some unknown parameters, and the goal is to build a model out of that.

I do not plan to build a predictor myself. It takes lots of time and does not benefit one much in-game: in terms of ranks it makes no difference if the bonus hour shifts, as everyone has the same bonus time. You can almost always get the highest reward by not missing the bonus hours in the last 20-24 hours, which can be done using VG bots. The last few games, where the final result matters, can be predicted fairly accurately by most models anyway.

The reason I wanted to write this is that there are a few things I spotted that are relevant but not accounted for in existing models, so it serves more as an investigation.

For starters, this is what you need to know (FEH-specific and negligible details are cropped, leaving just enough for a sufficient model here):
- Two teams undergo a head-to-head battle over 44 hours.
- Every player has a voting gauge which recovers by 1 vote per hour and is capped at 8.
- Every player has 2000 flags which they can spend over the battles. One may spend a maximum of 100 flags per vote applied. With N flags applied the score is multiplied by N. For example, if one spends 8 votes with 800 flags then the score is multiplied by 800. If no flags are spent the multiplier is 1 per vote.
- There are two multipliers: the normal multiplier starts from 1.1x and increases by 0.05x per hour. The bonus multiplier starts from 3.4x and increases by 0.2x per hour.
- The score is updated every hour. If one team's score is 1% more than the other's, the bonus multiplier is triggered for the weaker team during the hour.
- The team with higher score at the end wins.
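
The rules above can be written down as a few small Python helpers. This is only a sketch: the function names are made up, hours are assumed to be numbered from 1, and treating the 1% threshold as a strict inequality is my assumption rather than a documented rule.

```python
def normal_multiplier(hour):
    """Normal multiplier: 1.1x in hour 1, growing by 0.05x per hour."""
    return 1.05 + 0.05 * hour

def bonus_multiplier(hour):
    """Bonus multiplier: 3.4x in hour 1, growing by 0.2x per hour."""
    return 3.2 + 0.2 * hour

def flag_factor(flags_on_vote):
    """Score factor of a single vote: 1 without flags, at most 100 with flags."""
    return max(1, min(flags_on_vote, 100))

def bonus_team(score_a, score_b, threshold=0.01):
    """Return which team ('A' or 'B') gets the bonus this hour, or None.

    The trailing team gets the bonus when the leader is ahead by more than
    `threshold` (1%) of the trailing score. Strictness is an assumption.
    """
    if score_a > score_b * (1 + threshold):
        return "B"
    if score_b > score_a * (1 + threshold):
        return "A"
    return None
```

For instance, `bonus_team(102, 100)` gives the bonus to team B, while `bonus_team(100.5, 100)` triggers no bonus.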

Score normalization

First of all, we know that the score does not grow linearly, so we need a way to normalize it for a time-invariant comparison. An obvious choice would be the direct score ratio between the two teams. This is a very intuitive choice which also ties in with the bonus trigger, but the accumulated score certainly affects the velocity of this indicator over time.

Another choice is to divide the score by the multiplier; whether it is the ordinary or the bonus multiplier does not matter too much, because that is (almost) just a constant scaling. The problem is that players' VG activity isn't constant either: higher multipliers are expected at the end, so they prefer to spend flags towards the end. With flags around, the score obtained without spending flags is basically negligible. We need to capture when people spend flags.

With everything being non-linear it is hard to decide on the right exponent, so I decided to look at the accumulated score instead. It is natural to assume that players' activity is non-decreasing in general; then their points gained per hour, after being scaled up by the linearly growing multiplier, grow at least linearly. As a result, the accumulated points are at least quadratic, i.e., $\Omega (t^2)$.

Assume that the players' activity -- or the teams' activity as a whole -- is of order $O(t^{1+k})$; then the accumulated score will be of order $O(t^{2+k})$. It is not hard to find that the accumulated score is indeed of order $O(t^{2+k})$ for some small $k$, so let us just divide everything by $t^2$ before we look into the parameter $k$.
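
To make the step from activity to accumulated score explicit, here is a short sketch that treats the hourly sum as an integral. The symbols $a$ (a constant of proportionality) and $S(t)$ (the accumulated score) are introduced here only for illustration:
$S(t) \approx \int_0^t a\,s^{1+k}\,ds = \frac{a}{2+k}\,t^{2+k}, \qquad \frac{S(t)}{t^2} \approx \frac{a}{2+k}\,t^{k}$.
So after dividing by $t^2$, a flat curve corresponds to $k=0$ and a curve that keeps rising corresponds to $k>0$.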

Here are the two typical examples taken from VG 2021 June.

Example 1: VG final (F!Corrin vs Klein)


Example 2: VG quarterfinal 4



These are two typical matches in VG: example 1 is when the popularity of one side clearly overwhelms the other, while example 2 happens when one team has significantly higher popularity but not as extreme as in example 1.

All the charts are time-normalized by $t^2$ where $t$ is the average of the two multipliers.

In the first chart, the orange and yellow lines indicate the time-normalized boundaries for bonus multipliers, while the blue line shows the normalized score difference. The second chart shows the normalized activity of the first team (a positive score difference means that the first team is leading).

We expect the normalized player activity to be of order $O(t^{k-1})$, and from here we can estimate $k$. The spikes are where the bonus multiplier kicks in.
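
One way to turn this into a number is a log-log regression: if the normalized activity behaves like $t^{k-1}$, the slope of its logarithm against $\log t$ is $k-1$. The Python sketch below is an illustration under that assumption. The function names are hypothetical, and the bonus-hour spikes should be removed from the input beforehand since they sit on a different level.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def estimate_k_loglog(ts, normalized_activity):
    """Estimate k from activity already divided by t^2.

    ts are the (positive) time values used for the normalization and
    normalized_activity the matching positive activity values.
    """
    log_t = [math.log(t) for t in ts]
    log_a = [math.log(a) for a in normalized_activity]
    return ols_slope(log_t, log_a) + 1.0  # slope is k - 1
```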

We can see that the parameter $k$ clearly varies between situations. We can assume that $k$ is close to 1 in example 1, while $k$ is clearly much smaller than 1 in example 2. In fact, $k=0.2$ is a pretty good estimate there. The $k$ value can also be verified by checking the growth rate of the bonus boundary curve.

We can explain the correlation by how players anticipate the battle towards the end instead of casually spending their flags in the middle. Very interestingly, the parameter $k$ seems to be independent of the ratio of player base sizes. The more dominant a team is, the more bonus hours the opposing team will get, so in theory, if the parameter were decided by the frequency of bonus hours, the parameters for the two teams should be different, but that is not the case here. If we plot the activity of both teams on the same chart for example 1, we can see that the parameters for the two teams are more or less equal.



Calculating the parameter $k$ would be extremely helpful because we can then obtain normalized data. (And we will cover that in the sequel to this article!)

The three states of players

With the above graph we see that the activity of the players divides into three categories, which we call the three states.

A team is in an excited state if it receives a long-awaited bonus hour, or a bonus hour that shoots it into the leading position. The activity clearly spikes for this hour.

A team is in a post-excitement state if the bonus hour in its excited state shot it into the leading position and triggered the bonus for the opponent. Activity is then abnormally low because its players have run out of votes and flags.

A team is in a normal state otherwise. There is more fluctuation within this state, depending on the score situation.

We can plot the same for example 2 which is a lot more chaotic, but the activity is still clearly divided into the three states as described.


The pattern shows that most players aren't playing to optimize the chance to win as a team. Rushing to overtake the opponent early has little to no effect on the final result but looks good in a team sport, while the increased activity when a team enters bonus multiplier is natural as it maximizes score gain (for those who urgently need to spend flags).

I am actually quite surprised that this was not taken into account by most predictors, as it plays an important role at the end, where people react vigorously to the hourly updated results. For example, the JP predictor predicted a constant downward ping-pong at the end for example 1 (i.e., a bonus for the second team for 1 hour and no bonus for either team for 1 hour, alternating, with the two together favoring the second team), but the reaction from the second team at hour 42 should be much more violent as it sends the team into the leading position -- and of course the first team hits back with 10x the power. This is the nature of VG: players react to the score, not to the team result.

Daily variation on player activity

Clearly players do not stay awake 24/7 for these shitty rewards (well, even if the orbs are worth their monetary cost, that is merely a burger meal, so it's not worth staying awake overnight), so they stop playing when they are asleep. But people around the world live in different time zones and sleep at different times. More importantly, tastes in different parts of the world seem to differ. As a result, we may observe higher activity from one team during daytime and then higher activity from the other team during nighttime (which could be daytime in another part of the world).

Assuming that the main division is Japanese (or Asian) players vs the West, we can divide the players into 3 groups: time-invariant players, Asian players and Western players. We estimate the share of each group and how supportive each is of each team. This is taken as a scaling factor when we predict future outcomes.

And how do we do that? Well, this is simple linear algebra: these three groups of players can be modelled as 3 kinds of waves, namely constant, sine and cosine waves, orthogonal to each other. We can then apply orthogonal projection to estimate the share of each of the three.
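
A minimal Python sketch of such a projection is given below. It rests on several assumptions: the input is hourly, already time-normalized activity, the waves have a 24-hour period, and the window covers a whole number of days so that the three sampled waves are exactly orthogonal. The function name is hypothetical, and which phase corresponds to which region is left open.

```python
import math

def project_onto_daily_waves(activity, period=24):
    """Project hourly activity onto constant, sine and cosine waves.

    Returns (const, sin_coef, cos_coef) such that
    activity[t] ~ const + sin_coef*sin(2*pi*t/period) + cos_coef*cos(2*pi*t/period).
    The window must cover a whole number of periods; otherwise the sampled
    waves are no longer orthogonal and the plain projection is biased.
    """
    n = len(activity)
    if n == 0 or n % period != 0:
        raise ValueError("window must cover a whole number of periods")
    w = 2 * math.pi / period
    const = sum(activity) / n
    sin_coef = 2 / n * sum(y * math.sin(w * t) for t, y in enumerate(activity))
    cos_coef = 2 / n * sum(y * math.cos(w * t) for t, y in enumerate(activity))
    return const, sin_coef, cos_coef
```

For a window that is not a whole number of days (the 44 hours of a battle, say), a least-squares fit on the same three waves would be the natural substitute for the plain projection.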

*

I really believe that these three factors, together with what we already have around, would build a very accurate VG predictor, but surely no one would waste the time doing that.

To conclude this article, let me show the graph plotted for the same VG but for semifinal 2, which is a ping-pong all the way. I stacked (apologies for the poor stacking) the two graphs together so that you can observe the interaction more clearly. Since it's a perfect ping-pong for the first 20 hours or so, you can see that the activities alternate between the excited and post-excitement states without entering the normal state. The wavy pattern in the first 20 hours, which in fact also occurred in the first two examples, seems to raise more questions from here, though...


(To be continued)

Acknowledgements: raw data extracted from the Japanese predictor by @rammtiger_n

Wednesday 2 June 2021

Waking from the Dream

The pride of London. Captured from ChelseaTV.


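Nine years ago, in a remote small town.

The internet there was, of all things, still ADSL with a cap of only 20GB a month. Having long used up my data, I had no choice but to go to the living room and hog the TV on a Sunday morning.

I knelt in front of the TV, so close that I could feel the static from the screen.

I had been kneeling there since extra time began, and my brain had probably crashed the moment Bayern won their penalty. By the time I came to, the two sides had played out 120 minutes and were heading to a penalty shoot-out.

For some reason, even when Mata missed the first penalty I did not have the feeling that we would be left with regret like on that rainy night in Moscow. David Luiz's blast into the top right corner, Lampard deliberately changing his habit to fire straight down the middle and high, and Ashley Cole's shot into the right side netting at mid-height all confirmed it.

The moment Drogba lived up to expectations and scored the winning penalty, he immediately found Čech and they embraced in tears, while the exhausted Chelsea players sank to their knees crying. And at that same moment I cried too, collapsed in front of the TV and sobbing out loud -- before that I had apparently never shed tears over sport. After those two matches in '08 and '09 I just went to bed with a dark face; at most I would drag that referee out for a flogging later. What use would crying have been back then?

The child of the house's owner asked: why are you crying? Didn't the team you love win? I was still in tears and could not answer. The child looked at me, looked at the other adults, and ran off puzzled. What a strange young man.

Yes, why?

If I had to answer: tears are the exclusive privilege of the victor. Many may not agree with this view, but I believe it holds at least for a team with Chelsea's character. You can lose a match, you can pay your tuition, but you must never lose your soul. Only at the moment of victory does your effort turn into fact, and then you can do whatever you like, including shedding tears of joy.

So at the Champions League final nine years later, I cried again.

Chelsea began to lose their way after that Champions League title nine years ago. After meeting PSG two years in a row, winning first and then losing, they looked completely lost on the Champions League stage; last year the club was even hit with a transfer ban, and with no one to buy, a top-four finish looked precarious. But then Lampard -- the same Lampard who nine years ago, in the second leg against Barcelona, dispossessed Messi just before half-time and played the long assist to Ramires -- stepped up, and as manager led a group of youngsters to hold on to fourth place. With the pandemic raging, the transfer ban suddenly turned Chelsea into a beneficiary, and in the end Tuchel took over and led these youngsters to the summit of Europe.

The tears back then reflected all the refusing to die from the round of 16 onwards, plus the bitter results of '08 and '09; the tears this time are more about the gap between years of slump and a title suddenly won through a quick run of coincidences. The two differ in nature, but in how winding the road was they are hard to separate.

Another parallel with the previous title is the defence that carried the team through the Champions League. Compared with the desperate rescues of last time, this run was far calmer: only two goals conceded in seven knockout matches, one an overhead-kick wonder goal conceded in the 94th minute of a second leg while two goals ahead, the other a first-time side volley that levelled a lead. Both were difficult goals scored through individual ability and luck (both were also shortlisted for the Champions League goal of the season), and they did almost no real damage to the tie. Conceding only four in the whole competition is the fewest in history, and that statistic alone is enough to satisfy a defence nut like me.

The similarities between the two title runs are endless. For example, both squads were full of new recruits who had only just joined, with no big-match experience and no great expectations placed on them; or both times the team knocked out Spanish sides on the way up (last time beating the Yellow Submarine 3-0 in the do-or-die final group match), and so on... But what has not changed is that Chelsea are still that resilient, iron-blooded team. As long as that does not change, I will keep supporting them.

Oh, but although I fully support the team, the Nike shirts are ugly and I cannot support those. If I had to pick a player to print on the back... I would probably pick No. 11? He deserves the same affection we had for Torres.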