Joseph’s conjectures about commuting probability

The commuting probability P(G) of a finite group G is defined to be the probability that two uniformly random elements x, y \in G commute. If G has order n and k conjugacy classes, this is the same thing as k / n. You might think this is a good way of measuring how close a group is to being abelian, but a remarkable and well-known fact is that no finite group satisfies 5/8 < P(G) < 1, so in fact nonabelian groups never get close to being abelian, in this sense.

In the 1970s Keith Joseph made three insightful conjectures about the set of all possible commuting probabilities S = \{P(G) : G~\text{a finite group}\}.

  1. every limit point of S is rational,
  2. for every x \in (0, 1] there is some interval (x-\epsilon, x) that S avoids,
  3. every nonzero limit point of S is in S.

Note that 3 is stronger than 1.

In 2014 I proved the first two of these conjectures (building on an earlier paper of Hegarty), and then I publicly expressed doubt about the third. My doubt was largely based on the observation that there is a family of 2-groups whose commuting probability converges to 1/2, but no 2-group has commuting probability 1/2 (although S_3 does).

But I was wrong! The third conjecture was proved recently in this paper of Thomas Browning:

The characteristic polynomial of a random matrix is almost always irreducible

I just uploaded to the arxiv a paper which can be viewed as vindication by anybody who has ever struggled to compute an eigenvalue by hand: the paper gives mathematical proof that eigenvalues are almost always as complicated as they can be. Intuitive as this may be, to prove it I needed two mathematical bazookas: both the extended Riemann hypothesis and the classification of finite simple groups!

An example will illustrate what I mean. Here is a random {10\times 10} matrix with {\pm1} entries:

\displaystyle  M = \left(\begin{array}{rrrrrrrrrr} -1 & 1 & 1 & -1 & -1 & 1 & -1 & 1 & -1 & 1 \\ -1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & 1 & 1 & -1 & -1 & -1 & 1 & 1 \\ 1 & -1 & -1 & 1 & -1 & -1 & 1 & 1 & 1 & -1 \\ -1 & -1 & -1 & 1 & 1 & 1 & 1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 & 1 \\ -1 & -1 & -1 & 1 & -1 & 1 & 1 & 1 & -1 & 1 \\ 1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & -1 & 1 & -1 & -1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 & -1 & 1 & -1 & 1 & -1 & -1 \end{array}\right) .

The eigenvalues of {M} as approximated by scipy.lingalg.eig are shown below.

But what if we want to know the eigenvalues exactly? Well, they are the roots of the characteristic polynomial of {M}, which is

\displaystyle  \varphi(x) = x^{10} - 4x^{9} + 6x^{8} - 80x^{6} + 144x^{5} - 544x^{4} + 1152x^{3} - 1024x^{2} + 1280x - 1024,

so we just have to solve {\varphi(x) = 0}. The Abel–Ruffini theorem states that, in general, this is not possible in the quadratic-formula-type sense involving radicals, etc., so there are certainly evil matrices that defy solution. But could there be something special about random matrices that makes {\varphi} typically more special?

It is not at all obvious that {\varphi} is “generic”. For one thing, if {M} is singular, then {\varphi} has an obvious root: zero! For another, if {M} happens to have a repeated eigenvalue, then {\varphi} and {\varphi'} share a factor, which is certainly a special algebraic property. It turns out that these events are low-probability events, but those are landmark results about random matrices (due originally to Komlós and Tao–Vu (symmetric case) and Ge (nonsymmetric), respectively).

Algebraically, the complexity of solving {\varphi(x) = 0} is measured by the Galois group {G}. In particular, the polynomial is irreducible if and only if {G} is transitive, and the equation is solvable by a formula involving radicals as in the quadratic formula if and only if {G} is solvable in the sense of group theory. Galois groups are not easy to calculate in general, but “typically” they are, by the following well-known trick. If we factorize {\varphi} mod {p} for some prime {p}, then, as long as we do not get a repeated factor, the degree sequence of the factorization is the same as the cycle type of some element of {G}. If we do this for enough small primes {p} we can often infer that {G = S_n}.

Let’s do this for the above example. If we factorize {\varphi} mod {p} for {p < 20}, discarding the “ramified” primes for which there is a repeated factor, we get the following degree sequences:

{p} degree sequence
{5} {\left(9, 1\right)}
{7} {\left(4, 4, 2\right)}
{13} {\left(5, 3, 2\right)}
{17} {\left(5, 3, 2\right)}
{19} {\left(7, 2, 1\right)}

Therefore {G} contains a {9}-cycle, an element of type {(4, 4, 2)}, etc. Just from the {9}-cycle and the {(4,4,2)} element it is clear that {G} must be transitive (in fact 2-transitive), so {\varphi} is irreducible. The are various other rules that help eliminate other possibilities for {G}. A common one is a theorem of Jordan which states that if {G} is primitive and contains a {p}-cycle for some prime {p \leq n - 3} then {G \geq A_n}. The element of cycle type {(5, 3, 2)} is odd and has a power which is a 5-cycle, so Jordan’s theorem implies {G = S_{10}}.

Conclusion: for the matrix {M} above, solving {\varphi(x) = 0} is a certifiable pain. You should just give up and accept {\varphi} itself as the prettiest answer you are going to get.

What my paper shows in general is that, if you choose the entries of an {n\times n} matrix from a fixed distribution in the integers,then, with probability tending to {1} as {n \to \infty}, the characteristic polynomial {\varphi} is irreducible, and moreover its Galois group is at least {A_n}. The result is conditional on the extended Riemann hypothesis, as mentioned, except in some nice special cases, such as when the entries are uniformly distributed in {\{1, \dots, 210\}.}

Note I only said “at least {A_n}”. Presumably the event {G = A_n} also has negligible probability, so {G = S_n} with probability {1 - o(1)}, but this remains open, and seems difficult. To prove it (naively), you have to show that the discriminant of the characteristic polynomial is, with high probability, not a perfect square. That should certainly be a low-probability event, but it’s not clear how to prove it!

Why do I need the extended Riemann hypothesis (ERH) and the classification of finite simple groups (CFSG)?

  • ERH: The example above involving factorizing mod {p} is not just useful in practice but also in theory. The prime ideal theorem tells you the statistics of what happens with your polynomial mod {p} for large {p}. In particular, to show that {G} is transitive it suffices to show that {\varphi} has one root on average mod {p}, for large primes {p}. Exactly how large {p} must be is determined by the error term in the prime ideal theorem, and ERH really helps a lot.
  • CFSG: An elaboration of the above strategy is used to show that {G} is not just transitive but {m}-transitive for any constant {m}. By CFSG, {m = 4} and {n > 24} is enough to imply {G \geq A_n}.

Arguably, matrices are not playing a large role in the statement of the theorem: it is just a statement about a random polynomial {\varphi}, where the model happens to be “take a random matrix with independent entries and take its characteristic polynomial”. The theorem should be true for any good model for a random polynomial of high degree. For example, in the “independent (or restricted) coefficients model”, the coefficients of {\varphi} are independent with some fixed distribution. For this model the corresponding statements were proved recently by Bary-Soroker–Kozma (see also Bary-Soroker’s blog post) and Breuillard–Varju, and I certainly borrow a lot from those papers. However, the common ground is isolated to the local–global principle that reduces the problem to a problem modulo {p}, and there the commonality ends. The independent coefficients case becomes a question about a certain kind of random walk (see the Breuillard–Varju paper) mod {p}, whereas the characteristic polynomial case becomes a problem about counting eigenvalues of a random matrix mod {p}.

There are many beautiful questions and mysteries about random polynomials. See for example the many high-definition pictures on John Baez’s website here showing the roots of random polynomials with independent coefficients. In many ways, taking the characteristic polynomial of a random matrix is actually a more natural model of a random polynomial, and some aspects become simpler. For example, I bet that the {A_n} vs {S_n} question will be settled sooner in the case of a random characteristic polynomial than in the case of independent coefficients.