Saturday 19 June 2021

FEH VG predictor continued: wave pattern and early estimation

Building a model for VG is something that I have wanted to do for a long time. In the previous article I wrote about the basics of a VG model, and the article concluded with the chart below:

The perfect wave in the first 12 hours caught my eye -- is that a coincidence or a general phenomenon? The aim of this article is to look for further patterns that help us build the model. Before we start, recall the terms that I used in the previous article -- please refer to the previous article for further details.

- Three examples, all extracted from VG June 2021. You can extract the raw data from the Japanese predictor made by @rammtiger_n.

Example 1: Final (Popularity ratio >4)
Example 2: Quarterfinal 4 (Popularity ratio 1~2)
Example 3: Semifinal 2 (Popularity ratio close to 1)

- Parameter $k$: the parameter such that the accumulated score is of order $O(t^{2+k})$, or equivalently that the team's activity is of order $O(t^{1+k})$. To be more precise, for team $i$ ($i = 1,2$) define $c_i(t)$ to be the constant factor which scales with team size and switches between three values according to the state of the hour, and $f_i(t)$ to be the corresponding hour multiplier (which is either $1.05+0.05t$ or $3.2+0.2t$). Ignoring intraday variation, we assume that the team activity $A_i(t)$ is approximated by $c_i(t) f_i(t)^{1+k}$.

One should note that this parameter is not necessarily the same for the two teams, but the two values are close enough most of the time. Let us first assume that the parameter $k$ is uniform across the two teams.


The chart shown at the beginning is what happened in example 3. The curves are easy to spot because it is a perfect ping-pong where the activity of the two teams is almost equal. At the same time, when one team is in the excited state the other must be in the post-excitement state, as it is exhausted from its bonus multiplier in the previous hour. As a result we find two perfect curves with alternating dots, one for the activity in the excited state and another for the activity in the post-excitement state.

We do not have a perfect ping-pong most of the time, so is there any way to extract such a trend if it exists? One approach is to assign a factor to each of the three states: we may assume that the normalized activity in the excited state is 10 times the normal activity and 100 times the post-excitement activity. Although we can explain this by the fact that flags come with a multiplier of 100, such a ratio is still affected by the parameter $k$, which we do not want to fix.

There is a smarter way to get around this: observe that the states of the two teams are almost always excited + post-excitement or normal + normal. On rare occasions they could be normal + excited or normal + post-excitement, but these cancel each other out. Therefore we can simply take the (geometric) average of the (normalized) activity to retrieve the trend!

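Mathematically, we first guess the parameter to be $k_0$. We then normalize the activity by considering $A_i(t)/(f_i(t))^{1+k_0}$. Since $A_i(t) \approx c_i(t) f_i(t)^{1+k}$, each normalized activity is approximately $c_i(t) f_i(t)^{k-k_0}$. By taking the geometric mean we have that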
$GM = (c_1(t)c_2(t))^{1/2}(f_1(t)f_2(t))^{(k-k_0)/2}$.
If we are either in the excited + post-excitement or normal + normal states, then $\sqrt{c_1(t)c_2(t)} = c_S$ is a constant. Since $f_1(t)f_2(t)$ is always $\Theta (t^2)$, we know that the geometric mean is constant (or regressed to be constant) if and only if $k=k_0$, i.e., if the estimated parameter $k_0$ meets the true parameter. We take log GM instead of GM to even out the impact of normal + excited states against normal + post-excitement states.
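To make the cancellation concrete, here is a minimal Python sketch on synthetic data. The function names, the made-up state factors 10 and 0.1, and the value $k=0.85$ are assumptions for illustration only; this is not the spreadsheet behind the charts.

```python
import math


def hour_multiplier(t: int, bonus: bool) -> float:
    """f_i(t) from the text: 1.05 + 0.05 t normally, 3.2 + 0.2 t in a bonus hour."""
    return 3.2 + 0.2 * t if bonus else 1.05 + 0.05 * t


def log_geometric_mean(a1, a2, f1, f2, k0):
    """Hourly log GM of the normalized activities A_i(t) / f_i(t)^(1 + k0)."""
    result = []
    for x1, x2, m1, m2 in zip(a1, a2, f1, f2):
        n1 = x1 / m1 ** (1 + k0)
        n2 = x2 / m2 ** (1 + k0)
        result.append(0.5 * (math.log(n1) + math.log(n2)))
    return result


# Synthetic perfect ping-pong (made-up numbers, not VG data): team 1 is excited
# on even hours, team 2 on odd hours, and the other team is post-excitement.
K_TRUE = 0.85
hours = list(range(1, 45))
f1 = [hour_multiplier(t, t % 2 == 0) for t in hours]
f2 = [hour_multiplier(t, t % 2 == 1) for t in hours]
c1 = [10.0 if t % 2 == 0 else 0.1 for t in hours]  # excited vs post-excitement
c2 = [0.1 if t % 2 == 0 else 10.0 for t in hours]  # so c1 * c2 = 1 every hour
a1 = [c * f ** (1 + K_TRUE) for c, f in zip(c1, f1)]
a2 = [c * f ** (1 + K_TRUE) for c, f in zip(c2, f2)]

flat = log_geometric_mean(a1, a2, f1, f2, k0=K_TRUE)  # constant: the guess is right
falling = log_geometric_mean(a1, a2, f1, f2, k0=1.0)  # trends down: the guess is too big
```

With the correct guess the series is flat even though each team's own activity jumps by a factor of 100 from hour to hour, while an overestimated $k_0$ produces a steady downward drift.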

As a demonstration, we calculate the log-geometric mean of team activity for example 1 and get the following chart (with the guess $k_0=1$):

We can see a downward trend starting from hour number 8, indicating that $k=1$ is an overestimate here. 

Again we retrieve the same early wavy pattern as in the first chart. It has a simple explanation: in FEH there are quests to clear, and you need to clear these simple quests to get the (maximum number of) flags. The quests are mostly "clear VG with red/blue/green/colorless unit", but they require you to actually enter VG battles. On the other hand, you start the event with zero votes, so you cannot do these quests right away. Most people do these quests with votes almost fully restored, which is exactly 4-8 hours into the event.

Now we can estimate $k$ by removing the first 4 hours as outliers and searching for the $k_0$ at which the linear regression returns a zero slope. Since the regressed slope is strictly decreasing in $k_0$, we can always find such a $k_0$.
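The search can be sketched as a bisection on $k_0$. This is one possible implementation under stated assumptions, not the method behind the charts: the function names, the bracket $[-1, 5]$, the optional `last` cutoff and the synthetic data are all hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def log_gm_slope(hours, a1, a2, f1, f2, k0):
    """Regressed slope of log GM of A_i(t) / f_i(t)^(1 + k0) against the hour."""
    ys = [
        0.5 * (math.log(x1 / m1 ** (1 + k0)) + math.log(x2 / m2 ** (1 + k0)))
        for x1, x2, m1, m2 in zip(a1, a2, f1, f2)
    ]
    return ols_slope(hours, ys)


def estimate_k(hours, a1, a2, f1, f2, skip=4, last=None, lo=-1.0, hi=5.0, iterations=60):
    """Bisect on k0 until the slope vanishes, using hours skip < t <= last only."""
    rows = [
        r for r in zip(hours, a1, a2, f1, f2)
        if r[0] > skip and (last is None or r[0] <= last)
    ]
    h, x1, x2, m1, m2 = (list(col) for col in zip(*rows))
    if log_gm_slope(h, x1, x2, m1, m2, lo) < 0 or log_gm_slope(h, x1, x2, m1, m2, hi) > 0:
        raise ValueError("zero slope is not bracketed by [lo, hi]")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The slope decreases with k0, so a positive slope means k0 is too small.
        if log_gm_slope(h, x1, x2, m1, m2, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic ping-pong with a made-up k = 0.85 (not VG data).
hours = list(range(1, 45))
f1 = [3.2 + 0.2 * t if t % 2 == 0 else 1.05 + 0.05 * t for t in hours]
f2 = [3.2 + 0.2 * t if t % 2 == 1 else 1.05 + 0.05 * t for t in hours]
a1 = [(10.0 if t % 2 == 0 else 0.1) * f ** 1.85 for t, f in zip(hours, f1)]
a2 = [(0.1 if t % 2 == 0 else 10.0) * f ** 1.85 for t, f in zip(hours, f2)]

k_all = estimate_k(hours, a1, a2, f1, f2)             # every hour after the first 4
k_early = estimate_k(hours, a1, a2, f1, f2, last=20)  # hours 5~20 only
```

On this noiseless toy data both calls recover the $k$ that generated it. Because the log GM is affine in $k_0$, the slope is in fact linear in $k_0$, so the zero could also be solved for directly; bisection is used here only to mirror the search described above.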

If we apply that on example 1 we estimate $k$ to be 0.85: 

And if we apply that on example 2 we estimate $k$ to be 1.17:

The wavy pattern seems to be very consistent across all situations: we always observe two peaks, one at hour number 4 (which corresponds to 8 hours into the event since we removed the first four) and another at hour number 12 (16 hours into the event). We may interpret these as activity peaks from players in different parts of the world. Computationally, the peaks and troughs help us greatly, in the sense that we can do the same linear regression using only the first two peaks and troughs, i.e., the data of the first 20 hours, and the result is highly correlated with the estimate using all 44 hours of data.

Example 1: $k$ estimated to be 0.8 with the data of hour number 5~20 vs 0.85 on global data

Example 2: $k$ estimated to be 0.92 with the data of hour number 5~20 vs 1.17 on global data

It seems that such estimation is always an underestimate due to (out-of-correlation) increased activity at the far end, but we can always add a little bit to our estimate. 


So, what can we do with the predictor now? This is a proposed way of creating a prediction:

- Use the early data to estimate the constant factor for teams' activity with $k_0=1$
- Predict by combining team activity with a guess of the states
- Analyze team composition by wave decomposition at hour number 20 and modify $c_i(t)$ accordingly
- Update $k_0$ by linear regression every time before iterating through the prediction after hour number 12 or 20
- ???
- Feathers!

As big and serious as the above discussion is, I still prefer participating in the event in a simple way, by guessing the frequency of the bonus hours linearly. A 99% accurate predictor? Sure, but no thanks if I am the one who has to write the code. Not to mention that it is actually quite hard to measure the error in a dynamic system, and we just can't tell in a mathematically rigorous way how accurate our predictor could be...

The charts were not properly imported onto Google Drive, but you can plot them easily. Columns L-N are the time-normalized difference and bonus boundaries with $k=1$. Column T is the log-geometric mean of team activities. You can change $k$ as you like at W4 and W5, but the $k$ values for the two teams are equal by default. The three labels are SF2, F and QF4, which correspond to examples 3, 1 and 2 respectively.

Friday 11 June 2021

Some optimization on the FEH VG predictors

Voting Gauntlets in FEH are always controversial in many ways. In terms of outcome, some hate to see popular characters always triumph over ordinary characters, while the rest complain that the result is unpredictable and favors the chasing side by so much. In terms of reward, some players are unhappy about the lack of rewards -- well, actually 12 orbs is a lot, and the feathers are also very friendly to new players (and even to me back in the day). In terms of difficulty, some find that playing with 3 random characters is quite fun, but some say they have terrible luck and that facing 3 fallen Edgelords is bullshit.

But today I want to talk about mathematics and not the game mode itself. How can we predict the outcome given the first few/12/24 hours of data? Certainly there are a few attempts already: on Reddit there are a few predictors from the West, and there is also one from Japan. I found that the interface of the Japanese predictor is pretty nice, even though its prediction is sometimes off.

In the past I have talked about VG in the sense of a multiplayer game -- in the game theory sense, but this article is doing the complete opposite. We assume that the reaction is fixed under some unknown parameters, and the goal is to build a model out of that.

I do not plan to build a predictor myself. It takes lots of time and does not benefit one much in-game: in terms of ranks it makes no difference if the bonus hour shifts, as everyone has the same bonus time. You can almost always get the highest reward by not missing the bonus hours in the last 20-24 hours, which can be done using VG bots. The last few games, where the final result matters, can be predicted fairly accurately by most models anyway.

The reason I wanted to write this is that I spotted a few relevant things that are not accounted for in existing models, so it serves more as an investigation.

For starters, this is what you need to know: (FEH-specific and negligible details are omitted, just to give a sufficient model here)
- Two teams undergo a head-to-head battle over 44 hours.
- Every player has a voting gauge which recovers by 1 vote per hour and is capped at 8.
- Every player has 2000 flags which they can spend over the battles. One may spend a maximum of 100 flags per vote. With N flags applied the score is multiplied by N: for example, if one spends 8 votes with 800 flags then the score is multiplied by 800. If no flags are spent the multiplier is 1 per vote.
- There are two multipliers: the normal multiplier starts from 1.1x and increases by 0.05x per hour. The bonus multiplier starts from 3.4x and increases by 0.2x per hour.
- The score is updated every hour. If one team's score is 1% more than the other's, the bonus multiplier is triggered for the weaker team during the hour.
- The team with higher score at the end wins.
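The multiplier and bonus rules above can be sketched in a few lines of Python. This is a toy model only: the function names are hypothetical, the reading of "1% more" as "at least 1.01 times the other score" is an assumption, and the raw activity numbers are made up.

```python
def multiplier(hour: int, bonus: bool) -> float:
    """Hour 1 gives 1.1x (normal) or 3.4x (bonus); +0.05x / +0.2x per later hour."""
    if bonus:
        return 3.4 + 0.2 * (hour - 1)
    return 1.1 + 0.05 * (hour - 1)


def bonus_team(score_a: float, score_b: float, threshold: float = 0.01):
    """Return "a" or "b" for the trailing team that gets the bonus, or None.

    Assumption: "1% more" means the leader has at least (1 + threshold) times
    the trailing team's score.
    """
    if score_a == score_b:
        return None
    if score_a >= score_b * (1 + threshold):
        return "b"
    if score_b >= score_a * (1 + threshold):
        return "a"
    return None


def play_hour(hour, score_a, score_b, raw_a, raw_b):
    """Advance one hour; raw_a and raw_b are the teams' points before the multiplier."""
    bonus = bonus_team(score_a, score_b)
    score_a += raw_a * multiplier(hour, bonus == "a")
    score_b += raw_b * multiplier(hour, bonus == "b")
    return score_a, score_b


# Made-up raw activity: team b is twice as active as team a every hour.
score_a = score_b = 0.0
history = []  # which team had the bonus in each hour
for hour in range(1, 45):
    history.append(bonus_team(score_a, score_b))
    score_a, score_b = play_hour(hour, score_a, score_b, raw_a=10.0, raw_b=20.0)
```

Even with constant raw activity, `history` shows how the bonus keeps handing the weaker team extra points, which is why the raw score alone is a poor indicator of team strength.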

Score normalization

First of all, we know that the score does not grow linearly, so we need a way to normalize it for a time-invariant comparison. A clear choice would be the direct score ratio between the two teams. This is a very intuitive choice which also ties in with the bonus trigger, but the accumulated score certainly affects how fast this indicator can move over time.

Another choice is to divide the score by the multiplier; whether it is the ordinary or the bonus multiplier does not matter too much because that is (almost) just a constant scaling. The problem is that players' VG activity isn't constant either: higher multipliers are expected at the end, so they prefer to spend flags towards the end. With flags around, the score obtained without spending flags is basically negligible. We need to capture when people spend flags.

With everything being non-linear it is hard to decide the right exponent, so I decided to look at the accumulated score instead. It is natural to assume that players' activity is non-decreasing in general, so their points gained per hour, after being scaled by the linearly growing multiplier, grow at least linearly. As a result, the accumulated points are at least quadratic, i.e., $\Omega (t^2)$.

Assume that the players' activity -- or the teams' activity as a whole -- is of order $O(t^{1+k})$; then the accumulated score will be of order $O(t^{2+k})$. It is not hard to find that the accumulated score is indeed of order $O(t^{2+k})$ for some small $k$, so let us just divide everything by $t^2$ before we look into the parameter $k$.
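As a sanity check of the exponent bookkeeping, summing hourly points of order $t^{1+k}$ should give an accumulated score of order $t^{2+k}$. The Python sketch below verifies this on synthetic data with a log-log fit; the fit is a swapped-in illustration rather than the method used for the charts, and the names and the value $k=0.5$ are made up.

```python
import math


def loglog_slope(ts, values):
    """Least-squares slope of log(values) against log(ts)."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic hourly points of order t^(1 + k), with a made-up k (not VG data).
K_TRUE = 0.5
hours = list(range(1, 45))
hourly = [t ** (1 + K_TRUE) for t in hours]

accumulated = []
total = 0.0
for points in hourly:
    total += points
    accumulated.append(total)

# Fit the growth exponent of the accumulated score, skipping the early hours
# where lower-order terms of the sum still matter, then subtract the 2.
exponent = loglog_slope(hours[8:], accumulated[8:])
k_estimate = exponent - 2

# After dividing by t^2, what is left grows roughly like t^k.
normalized = [s / t ** 2 for s, t in zip(accumulated, hours)]
```

The fitted `k_estimate` lands close to, but not exactly on, the generating value because a finite sum is only asymptotically a pure power of $t$.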

Here are two typical examples taken from VG June 2021.

Example 1: VG final (F!Corrin vs Klein)

Example 2: VG quarterfinal 4

These are two typical matches in VG: example 1 is when the popularity of one team clearly overwhelms the other, while example 2 happens when one team has significantly higher popularity, but not as extreme as in example 1.

All the charts are time-normalized by $t^2$ where $t$ is the average of the two multipliers.

In the first chart, the orange and yellow lines indicate the time-normalized boundaries for the bonus multiplier, while the blue line shows the normalized score difference (a positive score difference means that the first team is leading). The second chart shows the normalized activity of the first team.

We expect the normalized player activity to be of order $O(t^{k-1})$, and from here we can estimate $k$. The spikes are where the bonus multiplier kicks in.

We can see that the parameter $k$ clearly varies between situations. We can assume that $k$ is close to 1 in example 1, while $k$ is clearly much smaller than 1 in example 2. In fact, $k=0.2$ is a pretty good estimate. The $k$ value can also be verified by checking the growth rate of the bonus boundary curve.

We can explain the correlation by how players anticipate the battle towards the end instead of casually spending their flags in the middle. Very interestingly, the parameter $k$ seems to be independent of the ratio of player base sizes. The more dominant a team is, the more bonus hours the opposing team will get, so in theory, if the parameter were decided by the frequency of bonus hours, the parameters for the two teams should be different, but that is not the case here. If we plot the activity of both teams on the same chart for example 1, we can see that the parameters for the two teams are more or less equal.

Calculating the parameter $k$ would be extremely helpful because we can then get normalized data. (And we will cover that in the sequel to this article!)

The three states of players

With the above graph we see that the activity of the players divides into three categories, or three states as we call them.

A team is in an excited state if it receives a long-awaited bonus hour, or a bonus hour that shoots it into the leading position. The activity clearly spikes for this hour.

A team is in a post-excitement state if the bonus hour in its excited state shot it into the leading position, so that the bonus is now triggered for the opponent. Activity in this state is abnormally low because the players have run out of votes and flags.

A team is in a normal state otherwise. There is more fluctuation within this state, depending on the score situation.

We can plot the same for example 2 which is a lot more chaotic, but the activity is still clearly divided into the three states as described.

The pattern shows that most players aren't playing to optimize the chance to win as a team. Rushing to overtake the opponent early has little to no effect on the final result but looks good in a team sport, while the increased activity when a team enters bonus multiplier is natural as it maximizes score gain (for those who urgently need to spend flags).

I am actually quite surprised that this was not taken into account by most predictors, as it plays an important role at the end, where people react vigorously to the hourly updated results. For example, the JP predictor predicted a constant downward ping-pong at the end for example 1 (i.e., bonus for the second team for 1 hour + no bonus for either team for 1 hour alternately, and these two together result in favor of the second team), but the reaction from the second team at hour 42 should be much more violent as it sends the team into the leading position -- and of course the first team hits them back with 10x the power. This is the nature of VG prediction: players react to the score but not to the team result.

Daily variation on player activity

Clearly players do not stay awake 24/7 for these shitty rewards (well, even if orbs are worth their monetary cost, that is merely a burger meal, so it's not worth staying awake overnight), so they stop playing when they are asleep. But people around the world live in different time zones and sleep at different times. More importantly, tastes in different parts of the world seem to differ. As a result, we may observe higher activity from one team during daytime and then higher activity from the other team during nighttime (which could be daytime for that part of the world).

Assuming that the main division would be Japanese (or Asian) players vs the West, we can divide the players into 3 groups: time-invariant players, Asian players and Western players. We estimate the proportion of the three and how supportive they are of each team. This is taken as a scaling factor when we predict the future outcomes.

And how do we do that? Well, this is simple linear algebra -- these three groups of players can be modelled as 3 kinds of waves: constant, sine and cosine waves, which are orthogonal to each other. We can then apply orthogonal projection to estimate the proportion of each of the three.
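A minimal Python sketch of that projection follows, assuming a 24-hour period and hourly samples of an already normalized activity series. The function name and the test signal are hypothetical, and which wave corresponds to which region depends on the phase convention, which is not fixed here.

```python
import math


def project_daily_waves(activity, period=24):
    """Coefficients of the constant, sine and cosine components of `activity`."""
    # Use whole periods only, so that the three sampled waves are exactly orthogonal.
    usable = len(activity) - len(activity) % period
    if usable == 0:
        raise ValueError("need at least one full period of data")
    ys = activity[:usable]
    waves = {
        "constant": [1.0] * usable,
        "sine": [math.sin(2 * math.pi * t / period) for t in range(usable)],
        "cosine": [math.cos(2 * math.pi * t / period) for t in range(usable)],
    }
    # Orthogonal projection: <y, w> / <w, w> for each wave w.
    return {
        name: sum(y * w for y, w in zip(ys, wave)) / sum(w * w for w in wave)
        for name, wave in waves.items()
    }


# Made-up signal over 48 hours: 5 (constant) + 2 (sine) + 3 (cosine).
signal = [
    5.0
    + 2.0 * math.sin(2 * math.pi * t / 24)
    + 3.0 * math.cos(2 * math.pi * t / 24)
    for t in range(48)
]
coefficients = project_daily_waves(signal)
```

Note that a 44-hour battle contains only one full day, so this sketch would silently drop the last 20 hours; a least-squares fit over all hours would be the natural replacement when the window is not a whole number of days.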


I really believe that these three factors, together with what we already have around, would build a very accurate VG predictor, but surely no one would waste the time doing that.

To conclude this article, let me show the graph plotted for the same VG but for semifinal 2, which is ping-pong all the way. I stacked (apologies for the poor stacking) the two graphs together so that you can observe the interaction more clearly. Since it's a perfect ping-pong for the first 20 hours or so, you can see that the activity alternates between the excited and post-excitement states and never enters the normal state. The wavy pattern in the first 20 hours, which in fact also occurred in the first two examples, seems to raise more questions from here, though...

(To be continued)

Acknowledgements: raw data extracted from the Japanese predictor by @rammtiger_n

Wednesday 2 June 2021


The pride of London. Captured from ChelseaTV.