How to evaluate an ordering quiz

In a virtual pub quiz last week, Lucy and I were responsible for hosting the science round. Here was our question.

Put the following events in chronological order:

1. Discovery of the structure of DNA
2. Invention of the telescope
3. Periodic table developed
4. Invention of spectacles
5. Extinction of the dodo
6. Discovery of Uranus
7. Einstein’s theory of general relativity
8. First use of fingerprints in detection
9. Darwin’s theory of evolution
10. BCG vaccine first used on humans

Go on, have a go: write down your guess. I will tell you the correct answers in a moment, but before I do, how are you planning on evaluating your performance? What if you get 9 out of 10 events in the right order, but it turns out the event you thought was last was actually first? Then you got 0 out of 10 events in the right place, but surely you deserve more than 0 points.

One reasonable way of scoring is to count inversions, that is, pairs of events placed in the wrong relative order. With 10 events there are 45 pairs, and you deserve a point for every pair you get in the right order. An annoyance in practice is that this is a little tedious to calculate by hand, so I wrote a little tool for doing it. Try it in your own virtual pub quiz! (You may want to rescale to a 0–10 standard.)
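If you would rather see the idea in code, here is a minimal Python sketch of the pair-counting score. It is only an illustration and not necessarily how the tool works; the function name `pair_score` and the toy labels (events numbered by their true chronological rank) are assumptions made for the example.

```python
from itertools import combinations

def pair_score(guess, truth):
    """Count the pairs of events that `guess` puts in the same relative order as `truth`."""
    # Map each event to its position in the correct ordering.
    position = {event: k for k, event in enumerate(truth)}
    ranks = [position[event] for event in guess]
    # A pair earns a point when the earlier-listed event really is earlier.
    return sum(1 for a, b in combinations(ranks, 2) if a < b)

# Toy labels: event k is the k-th earliest event.
truth = list(range(1, 11))
# The scenario above: nine events in the right order, but the first event placed last.
guess = list(range(2, 11)) + [1]

score = pair_score(guess, truth)                          # 36 of the 45 pairs
total_pairs = len(truth) * (len(truth) - 1) // 2          # 45
rescaled = 10 * score / total_pairs                       # 8.0 on a 0-10 scale
print(score, rescaled)
```

So the unlucky guess that gets no event in the right place still earns 36 of the 45 available points.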

Anyway, here is the correct ordering:

Invention of spectacles (1290)
Invention of the telescope (1608)
Extinction of the dodo (1662)
Discovery of Uranus (1781)
Darwin’s theory of evolution (1859)
Periodic table developed (1869)
First use of fingerprints in detection (1892)
Einstein’s theory of general relativity (1915)
BCG vaccine first used on humans (1921)
Discovery of the structure of DNA (1953)

Mathematically, assume the events are represented by the symbols {1, \dots, n} in such a way that the correct ordering is {1, \dots, n}, and the proffered solution is {\pi}, which is some permutation of {1, \dots, n}. The number of inversions in {\pi} is defined by

\displaystyle  \textup{inv}(\pi) = \# \{(i, j) : 1 \leq i < j \leq n, \pi(i) > \pi(j)\}.

This quantity goes by many names, e.g., bubble-sort distance in computer science, Kendall tau distance in statistics, Coxeter length in group theory.

If {\pi} is sorted then {\textup{inv}(\pi) = 0}, while if {\pi} is anti-sorted then {\textup{inv}(\pi) = \binom{n}{2}}. If {\pi} is chosen uniformly at random then

\displaystyle \mathbf{E} [\textup{inv}(\pi)] = \frac12 \binom{n}{2}.

Write

\displaystyle \textup{inv}(\pi) = \sum_{1 \leq i < j \leq n} I_{ij}(\pi),

where {I_{ij}} is the indicator of the event that {\pi(i) > \pi(j)}. For uniformly random {\pi}, each {I_{ij}} has mean {1/2} and variance {1/4}. The pairwise correlations are readily calculated, because they reduce to statistics of 3-symbol permutations. Certainly {I_{ij}} and {I_{i'j'}} are independent if {\{i, j\} \cap \{i', j'\} = \emptyset}, while

\displaystyle \begin{array}{rcl} \mathbf{E} I_{ij} I_{i'j} &=& 1/3 \qquad (i < i' < j),\\ \mathbf{E} I_{ij} I_{ij'} &=& 1/3 \qquad (i < j < j'),\\ \mathbf{E} I_{ij} I_{jk} &=& 1/6 \qquad (i < j < k). \end{array}
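These three values can be checked by listing all six permutations of three symbols. The following Python sketch does exactly that; the helper name `expectation` is made up for the illustration, and positions 0 < 1 < 2 stand in for the three indices.

```python
from fractions import Fraction
from itertools import permutations

def expectation(event, n=3):
    """Exact probability of `event` under a uniformly random permutation of n symbols."""
    perms = list(permutations(range(n)))
    return Fraction(sum(1 for p in perms if event(p)), len(perms))

# I_{ij} I_{i'j} with i < i' < j: the last position holds the smallest value.
shared_right = expectation(lambda p: p[0] > p[2] and p[1] > p[2])
# I_{ij} I_{ij'} with i < j < j': the first position holds the largest value.
shared_left = expectation(lambda p: p[0] > p[1] and p[0] > p[2])
# I_{ij} I_{jk} with i < j < k: the permutation is fully reversed.
chained = expectation(lambda p: p[0] > p[1] and p[1] > p[2])

print(shared_right, shared_left, chained)  # 1/3 1/3 1/6
```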

It follows that

\displaystyle \begin{array}{rcl} \textup{Var}[\textup{inv}(\pi)] &=& \binom{n}{2} \frac{1}{4} + \binom{n}{3} \left( 2 \times (1/3 - 1/4) + 2 \times (1/3 - 1/4) + 2 \times (1/6 - 1/4) \right)\\ &=& \frac{n(n-1)}{8} + \frac{n(n-1)(n-2)}{36}. \end{array}

(Given a triple {i < j < k}, how many times do we have {\{i, j, k\} = \{i, j\} \cup \{i', j'\}} in each of the three cases above? Twice, twice, twice.) Call this {\sigma^2} (note {\sigma \sim n^{3/2} / 6}). Then, statistically, if our quiz participant is just a random monkey, we expect

\displaystyle \textup{inv}(\pi) \approx \frac12 \binom{n}{2} \pm \sigma.
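As a sanity check, the mean and variance formulas can be compared against brute force for small {n}. This is a sketch with hypothetical helper names; it enumerates every permutation, so it is only practical for small {n}.

```python
from fractions import Fraction
from itertools import combinations, permutations

def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    return sum(1 for a, b in combinations(perm, 2) if a > b)

def exact_mean_and_variance(n):
    """Exact mean and variance of inv over all n! permutations."""
    counts = [inversions(p) for p in permutations(range(n))]
    mean = Fraction(sum(counts), len(counts))
    second_moment = Fraction(sum(c * c for c in counts), len(counts))
    return mean, second_moment - mean ** 2

def formula_variance(n):
    """The closed form n(n-1)/8 + n(n-1)(n-2)/36."""
    return Fraction(n * (n - 1), 8) + Fraction(n * (n - 1) * (n - 2), 36)

for n in range(2, 8):
    mean, var = exact_mean_and_variance(n)
    assert mean == Fraction(n * (n - 1), 4)   # half of binom(n, 2)
    assert var == formula_variance(n)
```

The assertions pass silently for {n = 2, \dots, 7}.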

The script in the link above computes

\displaystyle Z = \frac{\frac12\binom{n}{2} - \textup{inv}(\pi)}{\sigma},

which under the null hypothesis has zero mean and unit variance. Score more than 2 to convince me you’re not a monkey!
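For illustration, here is one way the same {Z} could be computed in Python. This is a sketch of the formula above rather than the linked script itself; the function names and the sample guess are hypothetical, and events are labelled by their numbers in the original question.

```python
from itertools import combinations
from math import sqrt

def inversions(ranks):
    """Number of pairs i < j with ranks[i] > ranks[j]."""
    return sum(1 for a, b in combinations(ranks, 2) if a > b)

def z_score(guess, truth):
    """Z = (mean - inv) / sigma under the uniformly random null hypothesis."""
    n = len(truth)
    position = {event: k for k, event in enumerate(truth)}
    inv = inversions([position[event] for event in guess])
    mean = n * (n - 1) / 4
    sigma = sqrt(n * (n - 1) / 8 + n * (n - 1) * (n - 2) / 36)
    return (mean - inv) / sigma

# Correct ordering, using the item numbers from the question.
truth = [4, 2, 5, 6, 9, 3, 8, 7, 10, 1]
# Hypothetical guess with three adjacent pairs swapped (3 inversions).
guess = [2, 4, 5, 6, 3, 9, 7, 8, 10, 1]

z = z_score(guess, truth)
print(round(z, 2))  # roughly 3.49
```

For {n = 10} the mean is 22.5 and {\sigma^2 = 31.25}, so scoring more than 2 requires at most 11 inversions, that is, at least 34 of the 45 pairs in the right order.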

 
