Saturday 28 November 2020

Thoughts on Simon Marais 2020

Simon Marais 2020 was held on October 10, which was a while ago. Due to a series of coincidences, my affiliated school and I missed the chance to take part in the competition. I also didn't get the chance to look at the questions until now, so this is a very brief and irresponsible review of this year's exam.

A1: closed curve and stupid pigeonhole. 

A2: clear once you figure out how the piles interact with each other, depending on whether $k \mid n$ or not.

A3: nicely formulated question, although the solution reduces drastically to a bound instead of some fancy sets. The beauty of the solution lies in the fact that this sum can be optimized greedily. Locally this is simply first-year differentiation, and you will recognize that a specific geometric series does the job.

A4: yeah it sounds fancy and intimidating, but one may as well brute-force it all the way with coordinate geometry. I do not see it as more difficult than IMO-level questions that can also be defeated by co-geom brute force.

B1: this is more like an assignment question...I am not even interested in doing it. Questions of such depth should not even appear in these contests.

B2: oh unit fractions. This is also a nicely written question. The solution lies in the fact that you can order the arrangements in $S_k$ so that raising the sum means a descent in the order, which means that the process cannot go on forever.

B3: this is the kind of question that I do not like, where you either know the one trick that easily solves the question, or you have no chance at all. We have had enough cat and mouse questions.

B4: Again, possible (a) and impossible (b).

We see that the difficulty has been pretty consistent over the past years, but we have yet to see more abstractly formulated questions as in the Putnam exam: questions that can be stated rather easily, but whose solutions require further thought. Again I strongly recommend removing Q1 and extending the exam to 6 questions with a more approachable difficulty ladder.

Oh well, I have had a nice afternoon solving these problems.

Tuesday 17 November 2020

18/11/20

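 The old title banner on this page had been bugging me for a while, so I took it down. It was actually a test drawing from back when I had just bought my drawing tablet, so it has a fair bit of sentimental value... it just feels a bit out of touch with who I am now.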

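The same goes for this blog, really: what I wrote in 2007, in 2011, in 2015 and in 2020 are worlds apart, and rereading the old posts makes me blush a little. But I will certainly not take down anything I wrote before. After all, it is the proof that I existed.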

So, maybe, it is time to find a suitable new banner...?

Wednesday 4 November 2020

On the day of the 2020 US Election


 Result as of 6:07 ET.

It is now 11PM ET. Trump is looking good for a second term, and regardless of that, the results so far are clearly more favorable to him than the mainstream media polls suggested.

The question is, where is the error coming from?

I write lots of things on my blog, including politics. However, in this article I really want to look into the statistical aspect of the problem, instead of who is more correct politically or more capable of making America great again. This article is not meant to be exhaustive -- I am sure there are statisticians and data scientists who are more capable of giving precise figures than I am. This is a record of my own observations.

The traditional polls

In traditional statistics there are two stages in the process -- you first collect the data, then you do the inference.

When collecting the data there are two roles of course, the data collector and the samples.

On the data collector side, things could be biased by how the poll is done. This includes designing leading questions, or hinting at a presupposition during the poll. While I do not have clear evidence for these -- well, I did not look into the polls -- we have another potential sampling error here: sampling from an unrepresentative pool.

Polls may sample from a few different populations: adults, registered voters, or likely voters. Likely voters of course have a (much) higher chance of actually voting, and hence matter more. A number of mainstream media polls sampled adults or registered voters only, and that seemed to be biased towards the Dems -- or did they choose this method precisely because they knew it would be biased in their favored direction?

Responses from the crowd certainly affect the outcome heavily. First, the weighting might have shifted from what happened in 2016 or even 2018, especially given the volatile political environment right now. Even assuming it stayed constant, the big problem comes from the 2020 version of the Bradley effect -- how many voters are so-called shy Trump voters who are not willing to express their opinion in polls? People were skeptical about that, but big data and the early results say otherwise. The Bradley effect is clearly taking place.

Now we look into the inference part. If we believe that the poll makers follow the principles of sound statistical inference (which is a bold statement to make given what has been observed), then what could go wrong?

It's mainly about the margin of error [https://www.pewresearch.org/fact-tank/2016/09/08/understanding-the-margin-of-error-in-election-polls/] -- when you give a confidence interval on the lead instead of on a single poll percentage, the margin of error is roughly doubled. That's because whatever does not go to one party goes to the other (let's assume that the liberal party...or Kanye West are negligible), so every point of error in one share shows up twice in the difference.
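To make the doubling concrete, here is a minimal sketch assuming simple random sampling and a normal approximation. The sample size and the shares are hypothetical, and the function names are mine. For a two-way race the margin of error on the lead is exactly twice that on a single share; once a few points go to other candidates, it is slightly less than twice.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def moe_share(p, n, z=Z_95):
    """Margin of error for one candidate's share p among n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def moe_lead(p_a, p_b, n, z=Z_95):
    """Margin of error for the lead p_a - p_b under multinomial sampling.

    Var(p_a - p_b) = (p_a + p_b - (p_a - p_b)^2) / n, which already
    includes the negative covariance between the two shares.
    """
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * math.sqrt(variance)


if __name__ == "__main__":
    n = 1000  # hypothetical sample size
    # First pair: pure two-way race. Second pair: 6% goes elsewhere.
    for p_a, p_b in [(0.50, 0.50), (0.48, 0.46)]:
        share = moe_share(p_a, n)
        lead = moe_lead(p_a, p_b, n)
        print(f"{p_a:.2f} vs {p_b:.2f}: share +/-{share:.3f}, "
              f"lead +/-{lead:.3f}, ratio {lead / share:.2f}")
```

So a poll reported as "plus or minus 3 points" can only resolve a lead of roughly 6 points or more.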

Alternative methods

A few pollsters have tried new methods of detecting shy voters. Trafalgar Group adopted a mixed method [https://www.thetrafalgargroup.org/polling-methodology/] in order to prevent the "social desirability gap". Democracy Institute tries to figure out the true preferences of the voters by asking extra questions, like whether or not they think Trump will win, or who they think their neighbors would vote for. FiveThirtyEight apparently does not like the method, but the traditional methods are off as far as we have observed. This may as well open a whole new area: how to obtain the true data when interviewees intentionally hide their true preferences.

Of course there are models too. These models by nature aim to estimate people's willingness to vote for particular candidates without actually running a poll. They include the famous Primary Model [http://primarymodel.com/]. Others include a delayed correlation with media noise and so on. Some of these are also digitized and measured on online platforms.

Interestingly, these are in general more favorable towards Trump -- some even gave predictions that are too good to be true. For example the Primary Model predicted 362 electoral votes -- while 300-plus isn't impossible, anything above 320 seems very unlikely. The thing is that the electoral college is not a smooth scale. Not only do the electoral votes jump by the number of votes assigned to each state, but the states needed to get beyond 320 are all deep blue. Flipping any of them would be exponentially harder than flipping the swing states. While the models are designed to reflect who will win, they may not extrapolate to landslide victories.
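The "not a smooth scale" point can be illustrated with a toy uniform-swing calculation. This is only a sketch: the states, their electoral votes and their baseline margins below are entirely made up (they merely sum to 538), and a uniform national swing is itself a crude assumption.

```python
# Hypothetical states: (label, electoral votes, baseline margin in points
# for the candidate; negative means the opponent leads). Not real data.
STATES = [
    ("safe-1", 120, 15.0),
    ("safe-2", 80, 8.0),
    ("swing-1", 29, 1.0),
    ("swing-2", 20, -0.5),
    ("swing-3", 16, -1.5),
    ("lean-opp", 38, -6.0),
    ("deep-opp-1", 55, -20.0),
    ("deep-opp-2", 180, -25.0),
]


def electoral_votes(swing, states=STATES):
    """Electoral votes won if every state's margin shifts by `swing` points."""
    return sum(ev for _, ev, margin in states if margin + swing > 0)


if __name__ == "__main__":
    for swing in [0, 1, 2, 3, 5, 7, 10, 15, 21]:
        print(f"swing {swing:+3d} pts -> {electoral_votes(swing)} electoral votes")
```

In this toy map the first two points of swing are worth 36 electoral votes, while going from 7 to 15 points is worth none, because every remaining state is far out of reach. A model calibrated on who wins can therefore be wildly off on the size of the win.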

Models that are not built state by state certainly have similar problems. They fail to distinguish what happens in different states, and that can produce a huge difference when it comes to predicting electoral votes.

One final thought

The unusually high turnout is special to this election. Given the same poll result, the outcome under a low turnout will certainly be different from the outcome under a high turnout. That is because the group of people on the edge of voting or not voting does not scale smoothly either.

This is hugely different from elections in Hong Kong, where we may comfortably assume that pro-government voters are fully mobilized regardless of the overall willingness to vote. Thus any extra votes will be heavily biased towards the pro-democracy side.

Such an assumption is false for the US election, because the voter composition of both parties is highly complex and dynamic. We can easily give numerous reasons why the extra votes would be biased towards one party or the other.

For example, Biden supporters may say that the extra votes are more likely to go to the Dems. That's because we are observing a historic high in absentee ballots, which surely contain a lot more Dem votes.

On the other hand, Trump supporters are justified in believing that the extra votes favor the Reps: the rallies showed that Trump supporters are more active and more motivated to vote. Someone also raised the interesting observation that covid triggered distance learning, which reduced peer pressure from college friends, who in general lean towards the Dems.
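The two turnout assumptions above can be contrasted in a small sketch. All numbers are hypothetical and the function names are mine: the first function models a side whose vote count is fixed (the Hong Kong-style assumption), the second lets the extra voters split in an unknown proportion (the US case).

```python
def share_with_fixed_base(base_votes, total_votes):
    """Share of side B when side A is fully mobilized at base_votes.

    Every vote beyond base_votes is assumed to go to side B.
    """
    return 1 - base_votes / total_votes


def share_with_marginal_split(core_a, core_b, extra_votes, extra_share_b):
    """Share of side B when extra voters split, a fraction extra_share_b for B."""
    total = core_a + core_b + extra_votes
    return (core_b + extra_votes * extra_share_b) / total


if __name__ == "__main__":
    # Fixed base: higher turnout can only help side B.
    for total in (100, 120, 150):
        print(f"fixed base 40, total {total}: B share "
              f"{share_with_fixed_base(40, total):.3f}")

    # Marginal split: the same 20 extra votes can push either way.
    for lean in (0.4, 0.5, 0.6):
        share = share_with_marginal_split(50, 50, 20, lean)
        print(f"extra voters {lean:.0%} for B: B share {share:.3f}")
```

Under the first assumption the direction of the turnout effect is known in advance; under the second it depends entirely on the unknown lean of the marginal voters, which is exactly what the two camps disagree about.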

If we keep going deeper, how does the variation depend on the default stance of the counties (or, as shown in the polls)? Or on county population, income, age distribution and so on?

No matter what the answer is, we will learn much from the exit polls of the current election. Together with all the new polling methods and models, there is plenty for us to investigate, scientifically.

6:00 AM ET 4/11/2020 (yes, 7 hours after I started writing this, because the live feeds are overwhelmingly interesting to watch)

*At the moment WI, NV and AZ are extremely close. Oh this election is so interesting...

Sunday 1 November 2020

01/11/2020: A brief review of the autumn anime season

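When I draw my second sword, the heroine will descend from the sky--
Source: Our Last Crusade or the Rise of a New World, episode 1
Very good, but not among the shows covered this time.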

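The summer season just past kept suffering from the delays of the winter and spring shows, and became one of the most barren anime seasons in recent years. Quite a few titles, DanMachi for example, picked autumn as the earliest available season to launch, which made this year's autumn lineup unusually crowded: my initial watch list for the season had as many as twenty shows.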

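Although the number of shows this season is unusually high, we did not get a clash of titans like Violet Evergarden vs DARLING in the FRANXX vs Pop Team Epic. The giants of the season are DanMachi and The Irregular at Magic High School; both are long-established and both are sequels, so their fans have no reason to fight. Below those two there is a whole pile of solid shows, each with its own selling point, so everyone can just focus on watching without arguing over which one rules the season. Here I will recommend a few interesting shows from this season, and share my impressions after two or three episodes.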

<Possible spoilers below, proceed with caution>

Hypnosis Mic 3/12 episodes 7.5/10

Proof of having "human rights" as a fan.

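A male virtual idol project built around selling songs. I have followed it since the project began, and I have spent money on everything from the CDs to the live shows and the mobile game... I suppose that makes me a fan of the original?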

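So far the anime adaptation follows the existing drama tracks and manga very closely in formula and even in content, so the question is whether the anime can merge sound and visuals into a better experience. On that front the anime is basically an amplified version of the drama tracks and manga: the art design is very faithful, and the rap sequences can be seen as an evolution of the mobile game's rap animations, going from animated images to 3D effects while still expressing each character's style consistently.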

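Of course, the project's "cringe" factor is amplified along with it: two teams of grown men battling with microphones, breaking into rap at the slightest disagreement, and dancing while rapping in the street... Listening to the drama tracks or reading the manga never felt this awkward, but in the anime the feeling becomes obvious. That is not necessarily a bad thing, though? One more point: the main characters' eyes and eyebrows are extremely glamorous. It is not that noticeable in the manga or the mobile game, but on a full screen it is hard to miss. Well, they are the main characters after all.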

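Overall it meets expectations and works better than any previous format. It remains to be seen whether the writers intend to fill in some of the details.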

Akudama Drive 4/12 episodes 7/10

A show about a team of villains, set in a cyberpunk Osaka (Kansai).

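Cyberpunk and flamboyant Osaka are a match made in heaven. The anime spends a large number of shots depicting the fictional city of "Kansai", built out of stacked neon lights, and with a few landmarks such as Tsutenkaku thrown in, viewers naturally feel "ah, this is Osaka indeed" without much narration. The production team did not settle for the opening shots of Kansai; every episode adds new elements to the city. Also, the fight in the hotel room in episode 3 flows beautifully, from character blocking to the use of lighting (color) and set, and is absolutely cinema-grade art.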

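Another hallmark of cyberpunk is a dystopian setting, and this "Kansai" is no exception. Osaka has the notorious Nishinari ward; here, the whole of Kansai is the slum relative to Kanto. People are locked inside "Kansai" by nuclear radiation and oppressive rule, which is exactly a cyberpunk dystopia. In such a repressed society the spark of change naturally does not come from good people: a group of "unforgivable" villains, plus a heroine who joins by a twist of fate (?), and a bold journey begins.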

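The character design is beyond reproach, and the fights are finely drawn. A story where everyone has one special ability (plus one ordinary person) is rather formulaic, but after episode 4 the contrast between Kansai and Kanto seems to carry a deeper meaning. So far the heroine's interaction with the other characters is a little thin, though that may be because the first three episodes only cover a few days in-story. I can come back to that later, but either way this show is almost certainly a great watch.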

TONIKAWA: Over the Moon for You 4/12 episodes 8.5/10

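Dye your hair a bit lighter and you could go be a butler. Your wife is a rich young lady anyway.
Source: TONIKAWA: Over the Moon for You, episode 3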

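A lovey-dovey show from "old thief" Hata (Kenjiro Hata) and his voice-actress wife, tied with Our Last Crusade or the Rise of a New World as the most sugary couple show of the season. Not casting Masumi Asano as the heroine is, if anything, another way of showing off his wife...?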

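From the male lead's inner monologue theatre and his actual sharp competence, to the interactions with the heroine's whole household, to the various references tucked into side characters and details, everything smells of Hayate the Combat Butler. This is Hata's style: the story woven from simple lines is not sloppy at all. Each 23-minute episode may feel like only three minutes, but those three minutes are enough to give you the sugar of a whole cup of bubble tea at Tainan sweetness level.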

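A show like this really does not need many words; just look up a clip on YouTube and you will see. After watching it, go back and rewatch Hayate the Combat Butler too, and make it the TVB (聰聰21) dub.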

Moriarty the Patriot 4/12 episodes 7.8/10

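Holmes-style deduction is fascinating because it does not rely on modern forensics; it is the limit of what an ordinary person could deduce. Compared with orthodox (honkaku) mysteries, where every clue is spelled out explicitly, this style of storytelling is better suited to anime.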

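Following the formula, the protagonist must of course be tall, rich and handsome, with one or two assistants (a Watson) at his side, male or female. If female, she is inevitably the heroine and the story takes a romance route, as in Holmes of Kyoto. (As an aside, that series' mobile game is basically Candy Crush, and it drains the battery like crazy...)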

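And if you have to pick an era, London around 1900, then the center of the world, is still the best choice. The center of the world during the golden age of European civilization can host any clash of values: feudalism versus capital, rule of law versus rule of man... Works like The Great Ace Attorney, and even a semi-fictional steampunk anime like Princess Principal, use the London of that era as their reference setting.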

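Moriarty the Patriot collects all three of these classic settings: a detective story, or more accurately a crime story, about a handsome man in late-19th-century London, where the male lead uses his understanding of deduction to manipulate the pawns around him and carry out the social change he envisions.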

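So far only episode 1 of the anime has had a real case. More than the relatively plain plot, what interests me is the heavy use of montage to portray the crime in a more artistic way. Rather than showing the crime directly, a blood-stained statue of the Virgin Mary and the blood on the floor not only avoid adult-only content but also use silhouettes to hint at how filthy the crime is. The techniques used for the culprit's act at the start and the male lead's execution at the end, contrasting the culprit's indiscriminate hunger with the male lead's total control, bring out the lead's mystique and the sophistication of his methods.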

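The deduction in the anime is merely passable compared with similar works. Some of the information revealed is redundant; it is not there to mislead viewers, it just sits there and lets them speculate. Some of it is probably left over from cuts to the manga that were not handled well. For instance, in episodes 2 and 3 the elder Moriarty says he knows why he was asked to adopt the children, but the anime spends so little time on him that viewers cannot guess anything. In the manga, Moriarty's parents and elder brother clearly get much more screen time, and the father's dealings with other nobles confirm his vain, snobbish character: he agreed to take in the adopted sons only to get close to a duke's daughter. There are quite a few such bugs caused by cuts, and I hope the writers will take note. Then again, I hear there is quite a lot of anime-original material?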

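It is hard to judge the plot from the first few episodes, but like Akudama Drive this show puts special effort into storyboarding and effects, the OP, Dying Wish by Tasuku Hatanaka, is in the semi-classical operatic style I like, and the protagonist is a slightly twisted handsome man. That alone makes it worth following, right?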

The Day I Became a God 4/12 episodes 8/10

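Jun Maeda has finally pulled his attention back from the nurse ladies. This is his first major work since taking in everyone's criticism of Charlotte; whether it is reflection or revenge remains to be seen. Come to think of it, Charlotte was itself revenge after Angel Beats...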

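What is certain is that Maeda's signature elements are all here. People were betting on which episode baseball would show up in, and it caught everyone off guard by appearing in episode 1; mahjong then appears in episode 4. Another carried-over element is casting Ayane Sakura as the heroine. Hina Sato and Nao Tomori are of course very different in personality, but they are remarkably alike when they go tsundere, and firing off retorts in wildly shifting tones is exactly Sakura's forte. 2015, when Charlotte first aired, was Sakura's peak, but it is great that her voice acting is still remembered in 2020.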

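Some say Yota's interactions in the early episodes all mirror Maeda's own self-reflection; I have not thought much along those lines. What I see in the early episodes is a slightly sci-fi but overall youth-filled summer-vacation slice of life of a high school boy. But as the parallel storyline gradually surfaces, along with the countdown that started in episode 1, all I can do is brace myself for the stomach ache to start next episode...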

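I have to say, Jun Maeda has never disappointed when it comes to light comedy. The nonexistent fourth strike in baseball and the layers of MSG in the ramen had me laughing nonstop, although with the new mahjong yaku my head starts to hurt once I stop laughing. This one-theme-per-episode slice-of-life format works really well, but remember that what separated Angel Beats from Charlotte was precisely whether the plot ran wild once the stomach-ache part began. Ichiro Okouchi's Princess Principal ended well because the writer was swapped out for the last two episodes, but with Key and Maeda there is obviously no swapping. So when the stomach ache comes, and whether it is a good one or an ugly one, is entirely up to him.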

In the meantime, let us enjoy Sakura's acting.

*

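One last note: there are simply too many shows this season, and I do not intend to review and introduce each one. Popular shows like DanMachi and The Irregular at Magic High School need no promotion from me, so I picked shows that are not the most popular but that I find interesting.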

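I do not know whether I will have time to write a wrap-up at the end of the season; if I do have time, though, I will probably go back to writing my novel.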

01.11.2020 Shiga