Research Updates: Boston–Shalev for conjugacy classes, growth in linear groups, and the (amazing) Kelley–Meka result

1. Boston–Shalev for conjugacy classes

Last week Daniele Garzoni and I uploaded to the arXiv a preprint on the Boston–Shalev conjecture for the conjugacy class weighting. The Boston–Shalev conjecture in its original form predicts that, in any finite simple group G, in any transitive action, the proportion of elements acting as derangements is at least some universal constant c > 0. This conjecture was proved by Fulman and Guralnick in a long series of papers. Daniele and I looked at conjugacy classes instead, and we found the analogous result to be true: the proportion of conjugacy classes containing derangements is at least some universal constant c' > 0.

Our proof depends on the correspondence between semisimple conjugacy classes in a group of Lie type and polynomials over a finite field possibly with certain restrictions: either symmetry or conjugate-symmetry. We studied these sets of polynomials from an “anatomical” perspective, and we needed to prove several nontrivial estimates, e.g., for

  • the number of polynomials with a factor of a given degree (which is closely related to the “multiplication table problem”),
  • the number of polynomials with an even or odd number of irreducible factors,
  • the number of polynomials with no factors of small degree,
  • or the number of polynomials factorizing in a certain way (e.g., as f = gg^*, g irreducible, g^* the reciprocal polynomial).

For a particularly neat example, we found that, if the order of the ground field is odd, exactly half the self-reciprocal polynomials have an even number of irreducible factors — is there a simple proof of this fact?

2. Growth in Linear Groups

Yesterday Brendan Murphy, Endre Szabo, Laci Pyber, and I uploaded a substantial update to our preprint Growth in Linear Groups, in which we prove one general form of the “Helfgott–Lindenstrauss conjecture”. This conjecture asserts that if a symmetric subset A of a general linear group \mathrm{GL}_n(F) (n bounded, F an arbitrary field) exhibits bounded tripling, |A^3| \le K|A|, then A has a precise structure: there are subgroups H \trianglelefteq \Gamma \le \langle A \rangle such that \Gamma / H is nilpotent of class at most n-1, H is contained in a bounded power A^{O_n(1)}, and A is covered by K^{O_n(1)} cosets of \Gamma. Following prodding by the referee and others, we put in a lot more work and proved one additional property: \Gamma can be taken to be normal in \langle A \rangle. This seemingly technical additional point is actually very subtle, and I strongly doubted whether it was true late into the project, more-or-less until we actually proved it.

We also added another significant “application”. This is not exactly an application of the result, but rather of the same toolkit. We showed that if G \le \mathrm{GL}_n(F) (again F an arbitrary field) is any finite subgroup which is K(n)-quasirandom, for some quantity K(n) depending only on n, then the diameter of any Cayley graph of G is polylogarithmic in |G| (that is, Babai’s conjecture holds for G). This was previously known for G simple (Breuillard–Green–Tao, Pyber–Szabo, 2010). Our result establishes that it is only necessary that G be sufficiently quasirandom. (There is a strong trend in asymptotic group theory of weakening hypotheses that require simplicity to hypotheses requiring only quasirandomness.)

The intention of our paper is more-or-less to “polish off” the theory of growth in bounded rank. By contrast, growth in high-rank simple groups is still poorly understood.

3. The Kelley–Meka result

Not my own work, but it cannot go unmentioned. There was a spectacular breakthrough in additive combinatorics last week. Kelley and Meka proved a Behrend-like upper bound for the density of a subset A \subset \{1, \dots, n\} free of three-term arithmetic progressions (Roth’s theorem): the density of A is bounded by \exp(-c (\log n)^\beta) for some constants c, \beta > 0. Already there are other expositions of the method which are also worth looking at: see the notes by Bloom and Sisask, and those by Green (to appear, possibly).

Until this work, density 1/\log n was the “logarithmic barrier”, overcome only very recently, and only barely, by Bloom and Sisask. Now that the logarithmic barrier has been completely smashed, it seems inevitable that the new barometer for progress on Roth’s theorem will be the exponent \beta. Kelley and Meka obtain \beta = 1/11, while the Behrend construction shows \beta \le 1/2.
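Since the bound \beta \le 1/2 comes from the Behrend construction (lattice points on a sphere, read off digit-by-digit in a base large enough that addition never carries), it is worth recording how simple that construction is. Here is a toy Python sketch, with tiny parameters chosen purely for illustration (the helper names are mine, not from the Kelley–Meka paper):

```python
import itertools

def behrend_set(d, k):
    """A 3-AP-free subset of {0, ..., (2d)^k - 1}: take the lattice points
    of {0, ..., d-1}^k lying on the largest sphere, and read each point off
    as the digits of an integer in base 2d, so that addition never carries."""
    shells = {}
    for v in itertools.product(range(d), repeat=k):
        shells.setdefault(sum(c * c for c in v), []).append(v)
    shell = max(shells.values(), key=len)  # largest sphere, by pigeonhole
    return sorted(sum(c * (2 * d) ** i for i, c in enumerate(v)) for v in shell)

def has_3ap(A):
    """Brute-force check for a three-term arithmetic progression a < b < c."""
    S = set(A)
    return any(2 * b - a in S for a, b in itertools.combinations(sorted(A), 2))

A = behrend_set(4, 3)
print(len(A), has_3ap(A))
```

The point is that if a + c = 2b held with a, b, c in the set, the no-carry property would force the underlying lattice points to satisfy x + z = 2y coordinate-wise, and a sphere contains no three collinear points.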

The characteristic polynomial of a random matrix is almost always irreducible

I just uploaded to the arXiv a paper which can be viewed as vindication for anybody who has ever struggled to compute an eigenvalue by hand: the paper gives a mathematical proof that eigenvalues are almost always as complicated as they can be. Intuitive as this may be, to prove it I needed two mathematical bazookas: both the extended Riemann hypothesis and the classification of finite simple groups!

An example will illustrate what I mean. Here is a random {10\times 10} matrix with {\pm1} entries:

\displaystyle  M = \left(\begin{array}{rrrrrrrrrr} -1 & 1 & 1 & -1 & -1 & 1 & -1 & 1 & -1 & 1 \\ -1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & 1 & 1 & -1 & -1 & -1 & 1 & 1 \\ 1 & -1 & -1 & 1 & -1 & -1 & 1 & 1 & 1 & -1 \\ -1 & -1 & -1 & 1 & 1 & 1 & 1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 & 1 \\ -1 & -1 & -1 & 1 & -1 & 1 & 1 & 1 & -1 & 1 \\ 1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & -1 & 1 & -1 & -1 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 & -1 & 1 & -1 & 1 & -1 & -1 \end{array}\right) .

The eigenvalues of {M} as approximated by scipy.linalg.eig are shown below.

But what if we want to know the eigenvalues exactly? Well, they are the roots of the characteristic polynomial of {M}, which is

\displaystyle  \varphi(x) = x^{10} - 4x^{9} + 6x^{8} - 80x^{6} + 144x^{5} - 544x^{4} + 1152x^{3} - 1024x^{2} + 1280x - 1024,

so we just have to solve {\varphi(x) = 0}. The Abel–Ruffini theorem states that, in general, this is not possible in the quadratic-formula-type sense involving radicals, etc., so there are certainly evil matrices that defy solution. But could there be something special about random matrices that makes {\varphi} typically more special?

It is not at all obvious that {\varphi} is “generic”. For one thing, if {M} is singular, then {\varphi} has an obvious root: zero! For another, if {M} happens to have a repeated eigenvalue, then {\varphi} and {\varphi'} share a factor, which is certainly a special algebraic property. It turns out that both of these events have low probability, but these are landmark results about random matrices (due originally to Komlós for singularity, and to Tao–Vu (symmetric case) and Ge (nonsymmetric case) for repeated eigenvalues).
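As a quick sanity check (nothing deeper), both computations above can be reproduced in a few lines of numpy/scipy: the approximate eigenvalues via scipy.linalg.eig, and the characteristic polynomial via numpy.poly, whose floating-point coefficients can safely be rounded since we know they are integers.

```python
import numpy as np
from scipy.linalg import eig

# the random ±1 matrix M from the example above
M = np.array([
    [-1,  1,  1, -1, -1,  1, -1,  1, -1,  1],
    [-1,  1,  1,  1,  1,  1,  1,  1, -1, -1],
    [-1, -1,  1,  1,  1, -1, -1, -1,  1,  1],
    [ 1, -1, -1,  1, -1, -1,  1,  1,  1, -1],
    [-1, -1, -1,  1,  1,  1,  1, -1, -1, -1],
    [-1, -1, -1, -1, -1,  1,  1,  1,  1,  1],
    [-1, -1, -1,  1, -1,  1,  1,  1, -1,  1],
    [ 1, -1, -1, -1,  1,  1,  1,  1, -1, -1],
    [-1, -1,  1, -1,  1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1,  1, -1,  1, -1, -1],
])

w = eig(M, right=False)                   # approximate eigenvalues
coeffs = np.rint(np.poly(M)).astype(int)  # coefficients of det(xI - M), leading 1
print(sorted(w, key=abs, reverse=True))
print(coeffs)
```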

Algebraically, the complexity of solving {\varphi(x) = 0} is measured by the Galois group {G}. In particular, the polynomial is irreducible if and only if {G} is transitive, and the equation is solvable by a formula involving radicals as in the quadratic formula if and only if {G} is solvable in the sense of group theory. Galois groups are not easy to calculate in general, but “typically” they are, by the following well-known trick. If we factorize {\varphi} mod {p} for some prime {p}, then, as long as we do not get a repeated factor, the degree sequence of the factorization is the same as the cycle type of some element of {G}. If we do this for enough small primes {p} we can often infer that {G = S_n}.

Let’s do this for the above example. If we factorize {\varphi} mod {p} for {p < 20}, discarding the “ramified” primes for which there is a repeated factor, we get the following degree sequences:

{p} degree sequence
{5} {\left(9, 1\right)}
{7} {\left(4, 4, 2\right)}
{13} {\left(5, 3, 2\right)}
{17} {\left(5, 3, 2\right)}
{19} {\left(7, 2, 1\right)}
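These degree sequences are easy to reproduce in a computer algebra system; here is a short sympy sketch (the helper name is mine):

```python
from sympy import symbols, factor_list, degree

x = symbols('x')
# the characteristic polynomial phi from the example above
phi = (x**10 - 4*x**9 + 6*x**8 - 80*x**6 + 144*x**5 - 544*x**4
       + 1152*x**3 - 1024*x**2 + 1280*x - 1024)

def degree_sequence(f, p):
    """Degrees of the irreducible factors of f mod p, in decreasing order,
    or None if p is ramified (i.e. some factor is repeated)."""
    _, factors = factor_list(f, x, modulus=p)
    if any(m > 1 for _, m in factors):
        return None
    return sorted((degree(g, x) for g, _ in factors), reverse=True)

for p in [5, 7, 13, 17, 19]:
    print(p, degree_sequence(phi, p))
```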

Therefore {G} contains a {9}-cycle, an element of type {(4, 4, 2)}, etc. Just from the {9}-cycle and the {(4,4,2)} element it is clear that {G} must be transitive (in fact 2-transitive), so {\varphi} is irreducible. There are various other rules that help eliminate other possibilities for {G}. A common one is a theorem of Jordan which states that if {G} is primitive and contains a {p}-cycle for some prime {p \leq n - 3} then {G \geq A_n}. The element of cycle type {(5, 3, 2)} is odd, and its sixth power is a {5}-cycle, so Jordan’s theorem gives {G \geq A_{10}}, and the presence of an odd permutation then forces {G = S_{10}}.
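The bookkeeping in this last step can be double-checked with sympy.combinatorics (a toy verification, not needed for the argument): an element of cycle type (5, 3, 2) is an odd permutation, and its sixth power is a 5-cycle.

```python
from sympy.combinatorics import Permutation

# a representative element of S_10 with cycle type (5, 3, 2)
g = Permutation([[0, 1, 2, 3, 4], [5, 6, 7], [8, 9]])

print(g.is_even)               # the element is odd, so G is not inside A_10
print((g**6).cycle_structure)  # the 3-cycle and 2-cycle die; a 5-cycle survives
```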

Conclusion: for the matrix {M} above, solving {\varphi(x) = 0} is a certifiable pain. You should just give up and accept {\varphi} itself as the prettiest answer you are going to get.

What my paper shows in general is that, if you choose the entries of an {n\times n} matrix independently from a fixed distribution on the integers, then, with probability tending to {1} as {n \to \infty}, the characteristic polynomial {\varphi} is irreducible, and moreover its Galois group is at least {A_n}. The result is conditional on the extended Riemann hypothesis, as mentioned, except in some nice special cases, such as when the entries are uniformly distributed in {\{1, \dots, 210\}}.

Note I only said “at least {A_n}”. Presumably the event {G = A_n} also has negligible probability, so {G = S_n} with probability {1 - o(1)}, but this remains open, and seems difficult. To prove it (naively), you have to show that the discriminant of the characteristic polynomial is, with high probability, not a perfect square. That should certainly be a low-probability event, but it’s not clear how to prove it!

Why do I need the extended Riemann hypothesis (ERH) and the classification of finite simple groups (CFSG)?

  • ERH: The example above involving factorizing mod {p} is not just useful in practice but also in theory. The prime ideal theorem tells you the statistics of what happens with your polynomial mod {p} for large {p}. In particular, to show that {G} is transitive it suffices to show that {\varphi} has one root on average mod {p}, for large primes {p}. Exactly how large {p} must be is determined by the error term in the prime ideal theorem, and ERH really helps a lot.
  • CFSG: An elaboration of the above strategy is used to show that {G} is not just transitive but {m}-transitive for any constant {m}. By CFSG, {m = 4} and {n > 24} is enough to imply {G \geq A_n}.
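To illustrate the statistic in the first bullet, one can average the number of roots of {\varphi} mod {p} over small primes: for an irreducible polynomial the average tends to 1, the number of orbits of a transitive group. A brute-force sketch using the coefficients of the degree-10 example above (the helper is mine):

```python
from sympy import primerange

# coefficients of phi from the 10x10 example above, highest degree first
coeffs = [1, -4, 6, 0, -80, 144, -544, 1152, -1024, 1280, -1024]

def roots_mod_p(coeffs, p):
    """Count the roots of the polynomial mod p by brute force (Horner's rule)."""
    count = 0
    for a in range(p):
        v = 0
        for c in coeffs:
            v = (v * a + c) % p
        if v == 0:
            count += 1
    return count

primes = list(primerange(2, 2000))
avg = sum(roots_mod_p(coeffs, p) for p in primes) / len(primes)
print(round(avg, 3))
```

The average does hover around 1, as Chebotarev predicts; the role of ERH is to guarantee that primes of modest size already suffice for such statistics to kick in.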

Arguably, matrices are not playing a large role in the statement of the theorem: it is just a statement about a random polynomial {\varphi}, where the model happens to be “take a random matrix with independent entries and take its characteristic polynomial”. The theorem should be true for any good model for a random polynomial of high degree. For example, in the “independent (or restricted) coefficients model”, the coefficients of {\varphi} are independent with some fixed distribution. For this model the corresponding statements were proved recently by Bary-Soroker–Kozma (see also Bary-Soroker’s blog post) and Breuillard–Varju, and I certainly borrow a lot from those papers. However, the common ground is isolated to the local–global principle that reduces the problem to a problem modulo {p}, and there the commonality ends. The independent coefficients case becomes a question about a certain kind of random walk (see the Breuillard–Varju paper) mod {p}, whereas the characteristic polynomial case becomes a problem about counting eigenvalues of a random matrix mod {p}.

There are many beautiful questions and mysteries about random polynomials. See for example the many high-definition pictures on John Baez’s website here showing the roots of random polynomials with independent coefficients. In many ways, taking the characteristic polynomial of a random matrix is actually a more natural model of a random polynomial, and some aspects become simpler. For example, I bet that the {A_n} vs {S_n} question will be settled sooner in the case of a random characteristic polynomial than in the case of independent coefficients.