The apparent structure of dense Sidon sets

“What are dense Sidon sets of {1, …, n} like?”, asked Tim Gowers on his blog almost ten years ago. A Sidon set is a set without any solutions to $x+y=z+w$, which in additive combinatorics jargon means that it has minimal additive structure. Almost paradoxically, large sets with this property appear to be structured in another way, and that’s a bit of a mystery currently.

Now Freddie Manners and I have an idea about the answer to Gowers’s question, at least if “dense” means “really really dense, like 99% as dense as possible”, and the setting is a finite abelian group like $\mathbf{Z}/n\mathbf{Z}$ instead of $\{1, \dots, n\}$. Our suspicion is that any such set must be related in a specific way to the collineation group of a finite projective plane.

We uploaded our paper to the arxiv today, so have a look and tell us what you think! I also spoke about this recently at CANT 2021. The recording is available here: https://youtu.be/s4ItIkkUvF4

On the other hand, there is also some evidence pointing the other way. Forey and Kowalski showed recently that certain moderately dense Sidon sets arise from algebraic geometry, not from projective planes. It is not clear whether such sets reach the “really really dense” threshold; if so, that would contradict our conjecture, but I suspect they don’t.

The pancake sorting problem is to determine the diameter of the pancake graph, the Cayley graph of ${S_n}$ with respect to the generating set ${P_n = \{p_1, \dots, p_n\}}$, where ${p_k}$ is defined (in two-line notation) by

$\displaystyle p_k = \left(\begin{array}{ccccccc} 1 & 2 & \dots & k & k+1 & \cdots & n \\ k & k-1 & \dots & 1 & k+1 & \cdots & n \end{array}\right)$

(i.e., ${p_k}$ flips the top ${k}$ pancakes). It is not hard to see that the diameter is between ${n}$ and ${2n}$. Both lower and upper bound have been sharpened by a constant factor, and a natural conjecture is that ${\mathrm{diam}~\mathrm{Cay}(S_n, P_n) / n}$ converges to some intermediate constant (to be determined, ideally).

Here is a natural model problem: estimate ${|P^m|}$ for each fixed ${m}$ as ${n \to \infty}$. Clearly ${|P^m| \leq n^m}$, and we should be able to detect some fall-off from this trivial upper bound.

Proposition 1 There is an integer sequence ${(a_m)}$, beginning

$\displaystyle 1, 2, 6, 22, 90, 486, 3014, 22185, 182229, \dots$

(not in the OEIS) such that ${|P^m| \sim (a_m/m!) n^m}$ as ${n \to \infty}$ with ${m}$ fixed.

To define ${(a_m)}$ it is useful to introduce the hyperoctahedral group ${B_m = \{\pm1\} \wr S_m}$ (cf. burnt pancake sorting). You can think of ${B_m}$ as the group of “signed” or “antipodal” permutations: permutations ${\sigma}$ of ${\{\pm1, \dots, \pm m\}}$ such that ${\sigma(-x) = -\sigma(x)}$ for all ${x}$. There is a natural cover ${B_m \to S_m}$ defined by forgetting signs. We lift ${P_m}$ to ${B_m}$ in a particular way: define ${Q_m = \{q_1, \dots, q_m\}}$, where

$\displaystyle q_k = \left(\begin{array}{ccccccc} 1 & 2 & \dots & k & k+1 & \cdots & n \\ -k & -(k-1) & \dots & -1 & k+1 & \cdots & n \end{array}\right).$

Now define

$\displaystyle a_m = \# \{ q_{\sigma(1)} \cdots q_{\sigma(m)} : \sigma \in S_m \}.$

I computed the initial terms in the sequence using this simple formula. The first nontrivial relations are

$\displaystyle q_1 q_2 q_4 q_3 = q_2 q_4 q_3 q_1$

and its inverse (to verify I recommend using four asymmetric cards from a deck of cards), which together account for ${a_4 = 22 < 24}$.

To sketch the proof of Proposition 1, consider composing ${m}$ random elements of ${P_n}$. You can think of this as a two-step process: randomly divide the stack ${\{1, \dots, n\}}$ in ${m}$ places, and then rearrange the resulting ${m+1}$ pieces, possibly with some reversals. With high probability the points of division are all distinct, so they can be inferred from the result (this is true for ${m}$ up to about ${n^{1/2}}$). The only question is whether the order can be inferred, and this is precisely what is defined by ${a_m}$.

It would be informative for the pancake problem to know more about the sequence ${(a_m)}$. I don’t know whether ${a_m}$ can be computed more efficiently than the naive way above, e.g., whether there is a recurrence: if so that would really help a lot.

There is at least a submultiplicativity property that follows from the proposition:

Corollary 2 For ${m, k\geq 1}$, ${a_{m+k} / (m+k)! \leq a_m / m! \cdot a_k / k!}$. Hence there is a constant ${c_1 > 0}$ such that

$\displaystyle a_m/m! < e^{-c_1m + o(m)}.$

Proof: This follows from ${|P^{m+k}| \leq |P^m| |P^k|}$, dividing by ${n^{m+k}}$, and taking the limit. The second part follows from the subadditivity lemma and the data point ${a_4 = 22 < 24}$. $\Box$

I think a stronger bound should be true: there should be a constant ${c_2 > 0}$ such that

$\displaystyle a_m < m!^{1-c_2 + o(1)}.$

(One might then contemplate whether ${n/(1 - c_2) + o(n)}$ is the diameter of the pancake graph, but that seems a little bold at this stage.)