Tuesday 2 July 2024

Optimal difficulty curve for tests

A sequel to my previous babble about binomial distributions, but not really.

Is it better for test scores to be uniformly distributed rather than normally distributed?

The aptitude distribution of the students is normal; this is not something we can control. However, we can control the difficulty curve of the test so that the test results follow an arbitrary distribution.

To start with, let the aptitude of students be $X\sim N(0,1)$. A test $f$ is a non-decreasing surjection onto [0,100]. If the test is a piecewise linear function that is zero below one threshold and 100 above another, then we expect the result to be normal (with slight truncation). We can also use the normal CDF, scaled to [0,100], as the test, i.e. $f(x) = 100\,\Phi(x)$. In that case, $f(X)$ follows a uniform distribution.
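A quick simulation makes the contrast between the two tests visible. This is only a sketch: the thresholds at $\pm 2$ for the piecewise linear test, the sample size, and the names `linear_test`, `cdf_test` and `decile_counts` are my own choices for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def linear_test(x, lo=-2.0, hi=2.0):
    """Piecewise linear test: 0 below lo, 100 above hi (thresholds are assumed)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 100.0
    return 100.0 * (x - lo) / (hi - lo)


def cdf_test(x):
    """Normal CDF scaled to [0, 100]; sends N(0,1) aptitude to uniform scores."""
    return 100.0 * STD_NORMAL.cdf(x)


def decile_counts(scores):
    """Count how many scores fall in [0,10), [10,20), ..., [90,100]."""
    counts = [0] * 10
    for s in scores:
        counts[min(int(s // 10), 9)] += 1
    return counts


rng = random.Random(0)
aptitudes = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

linear_counts = decile_counts([linear_test(x) for x in aptitudes])
cdf_counts = decile_counts([cdf_test(x) for x in aptitudes])

print("linear test deciles:", linear_counts)
print("CDF test deciles:   ", cdf_counts)
```

The linear test piles scores up in the middle deciles, while the CDF test spreads them roughly evenly.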

However, students do not always perform to their ability. We need an additional noise term. The resulting performance is now $f(X) + \varepsilon$, where we assume $\varepsilon$ is normal noise. The initial question becomes finding $f$, or rather the distribution of $f(X)$, such that given two samples $x,x'$ from $X$ with $x \geq x'$, $P\big((f(x)+\varepsilon(x))-(f(x')+\varepsilon(x'))\geq 0\mid x\geq x'\big)$ is the greatest. That is, we seek $f(X)$ so that the chance of misordering student aptitude is minimized.
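The misordering chance can be estimated by brute force before doing any math. The sketch below rests on assumptions that are mine: the noise is independent $N(0,\sigma^2)$ with $\sigma = 10$ points, the linear test has thresholds at $\pm 2$, and `misorder_rate` is a hypothetical helper name.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def linear_test(x, lo=-2.0, hi=2.0):
    """Piecewise linear test clipped to [0, 100] (thresholds are assumed)."""
    return min(100.0, max(0.0, 100.0 * (x - lo) / (hi - lo)))


def cdf_test(x):
    """Normal CDF scaled to [0, 100], giving uniformly distributed scores."""
    return 100.0 * STD_NORMAL.cdf(x)


def misorder_rate(test, sigma=10.0, pairs=100_000, seed=0):
    """Fraction of student pairs whose noisy scores reverse their aptitude order."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(pairs):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        stronger, weaker = max(a, b), min(a, b)
        # Each student gets an independent noise term (assumed N(0, sigma^2)).
        score_stronger = test(stronger) + rng.gauss(0.0, sigma)
        score_weaker = test(weaker) + rng.gauss(0.0, sigma)
        if score_stronger < score_weaker:
            wrong += 1
    return wrong / pairs


linear_rate = misorder_rate(linear_test)
cdf_rate = misorder_rate(cdf_test)
print(f"linear test misorder rate: {linear_rate:.4f}")
print(f"CDF test misorder rate:    {cdf_rate:.4f}")
```

Under these assumptions the CDF test misorders fewer pairs than the linear one. Changing `sigma` or the thresholds changes the size of the gap.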

Despite the complexity of the normal density, the above probability can easily be written as a double integral, and some variational method could then potentially give the answer. But here is a simpler way to find the answer: maximizing that probability is the same as minimizing $p_f(r) = P(|f(x)-f(x')|\leq r)$ for two independent samples $x,x'$ from $X$. If there is a single $f$ that minimizes $p_f(r)$ across all $r$, then such an $f$ would surely minimize the chance of misordering.
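To spell out that step, here is a sketch under one extra assumption of mine: the two noise terms are independent $N(0,\sigma^2)$ with the same $\sigma$. Writing $\varphi$ and $\Phi$ for the standard normal density and CDF, the chance of misordering is

$$
P(\text{misorder}) = \iint \Phi\!\left(-\frac{|f(x)-f(x')|}{\sqrt{2}\,\sigma}\right)\varphi(x)\,\varphi(x')\,dx\,dx' = \int_0^\infty p_f(r)\,\frac{1}{\sqrt{2}\,\sigma}\,\varphi\!\left(\frac{r}{\sqrt{2}\,\sigma}\right)dr .
$$

The first form holds because the difference of the two noise terms is $N(0,2\sigma^2)$, so a score gap $g \geq 0$ is reversed with probability $\Phi(-g/(\sqrt{2}\,\sigma))$. The second form comes from writing $\Phi(-g/(\sqrt{2}\,\sigma))$ as the integral of its negative derivative over $r \geq g$ and swapping the order of integration. The weight on $p_f(r)$ is non-negative, so lowering $p_f(r)$ at every $r$ lowers the chance of misordering.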

Unsurprisingly, the uniform distribution is what we want. The reason is simple: majorization. The more spread out the distribution is, the lower $p_f$ would be, and the best possible one is the uniform distribution $U(0,100)$.
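As a sanity check rather than a proof, $p_f(r)$ can be estimated for a few values of $r$ and compared with the closed form for two independent $U(0,100)$ scores, $1-(1-r/100)^2$. The piecewise linear test with thresholds at $\pm 2$ is just one alternative I picked for comparison, and `gap_probability` is a hypothetical helper name.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def linear_test(x, lo=-2.0, hi=2.0):
    """Piecewise linear test clipped to [0, 100] (thresholds are assumed)."""
    return min(100.0, max(0.0, 100.0 * (x - lo) / (hi - lo)))


def cdf_test(x):
    """Normal CDF scaled to [0, 100], giving uniformly distributed scores."""
    return 100.0 * STD_NORMAL.cdf(x)


def gap_probability(test, r, pairs=50_000, seed=0):
    """Monte Carlo estimate of p_f(r) = P(|f(x) - f(x')| <= r)."""
    rng = random.Random(seed)
    close = 0
    for _ in range(pairs):
        gap = abs(test(rng.gauss(0.0, 1.0)) - test(rng.gauss(0.0, 1.0)))
        if gap <= r:
            close += 1
    return close / pairs


def uniform_gap_probability(r):
    """Exact P(|U - U'| <= r) for independent U, U' ~ U(0, 100), 0 <= r <= 100."""
    return 1.0 - (1.0 - r / 100.0) ** 2


results = {}
for r in (5, 10, 25, 50):
    results[r] = (
        gap_probability(linear_test, r),
        gap_probability(cdf_test, r),
        uniform_gap_probability(r),
    )
    linear_p, cdf_p, exact_p = results[r]
    print(f"r={r:>2}: linear={linear_p:.3f}  cdf={cdf_p:.3f}  uniform exact={exact_p:.3f}")
```

For these values of $r$ the CDF test gives the smaller $p_f(r)$, and the estimate sits close to the closed form.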

From an information theory perspective, this is clear because the uniform distribution has the highest entropy among distributions on [0,100], and thus gives the highest resolution. The answer is so smooth, with explanations from multiple perspectives, except that if $f(X)$ is uniform, we are going to fail half of the class...

So no, don't do that to the students.
