The avoidance density of (k,l)-sum-free sets

A set {S \subset\mathbf{N} = \{1, 2, \dots\}} is called sum-free if it contains no solutions to {x+y=z}. For example, the set of odd integers is sum-free, as is the set {\{51, 52, \dots, 100\}}. Generalizing these two examples, for every {\theta \in \mathbf{R}/\mathbf{Z}} (real numbers mod {1}), the “middle third set”

\displaystyle S_\theta = \{x \in N: \theta x \in (1/3, 2/3) \pmod 1\}

is sum-free (the odd integers is precisely {S_{1/2}}, while the interval {\{51, \dots, 100\}} is a subset of {S_\theta} for {\theta = 1/150 - \epsilon}). As observed by Erdős, for any set {A \subset \mathbf{N}}, one of the sets {S_\theta \cap A} is large, because

\displaystyle  \int |S_\theta \cap A| \,d\theta = |A|/3.

Therefore every set of {n} integers contains a sum-free subset of size at least {n/3}.

Suppose more generally you have some type of configuration {\mathfrak{C}} (being deliberately vague about allowed generality), and you are interested in finding {\mathfrak{C}}-free subsets of arbitrary sets. The avoidance density of {\mathfrak{C}} is the largest constant {\delta} such that every set of {n} positive integers has a {\mathfrak{C}}-free subset of size at least {\delta n}. Here are some examples.

  1. The equation {x - y = 1} has avoidance density {1/2}: clearly any subset of a long interval contains at most half-sized subsets without adjacent elements, and conversely I can always take either the odds or the evens to get an adjacency-free subset of at least half the size. The equation {x - y = 2} also has avoidance density {1/2}, as does {x - y = c} for any constant {c}. (Partition into arithmetic progressions of common difference {c} and take every other element to find your large subset. For an example of a set with no large {\mathfrak{C}}-free subsets, take an arithmetic progression of common difference {c}.)
  2. For any rational {a \neq 1}, the equation {ax = y} has avoidance density {1/2} (this is a similar easy exercise).
  3. Suppose {\mathfrak{C}} represents {3}-term arithmetic progressions. Then a {\mathfrak{C}}-free subset is a set without {3}-term arithmetic progressions, and by Roth’s theorem the set {\{1, \dots, n\}} does not have any positive-density {\mathfrak{C}}-free subsets, so the avoidance density of {\mathfrak{C}} is zero. Similarly, the avoidance density of {k}-term progressions is zero, by Szemerédi’s theorem. More generally, any sort of finite configuration which is preserved under affine maps has avoidance density zero, because you can always find it inside a sufficiently long arithmetic progression.

Erdős’s argument shows that {x+y=z} has avoidance density at least {1/3}. A few years ago, Ben Green, Freddie Manners and I showed that the avoidance density of {x+y=z} is equal to {1/3}: that is, for every {\epsilon>0} we constructed a set of {n} integers with no sum-free subset of size larger than {(1/3 + \epsilon)n}.

The next step is to look at {\mathfrak{C}_{k,l} : x_1 + \cdots + x_k = y_1 + \cdots + y_l} for arbitrary {k > l}. A {\mathfrak{C}_{k,l}}-free set is sometimes called {(k,l)}-sum-free. The natural conjecture is that the avoidance density of {\mathfrak{C}_{k,l}} is {1/(k+l)}, as this is what the natural adaptation of Erdős’s argument shows. I was able to prove this in two big cases: in the case {k \leq 3l}, by generalizing the previous method, and in the case {l = 1}, using a new argument based on transference and a beautiful theorem of Łuczak and Schoen.

Today a paper appeared on the arxiv by Yifan Jing which settles this question for all {k, l}. Thus the avoidance density of {\mathfrak{C}_{k,l}} is indeed {1/(k+l)}! To do this Jing extended the Łuczak–Schoen theorem to {l > 1}, proving the following theorem (Lemma 5.4 in Jing’s paper).

Theorem 1 Every {(k,l)}-sum-free subset of {\mathbf{N}} of upper density greater than {1/(k+l)} is contained in a periodic {(k,l)}-sum-free set.

What this means is that {(k,l)}-sum-free sets of density greater than {1/(k+l)} generally have to be structured. On the other hand, if {\delta} is the avoidance density of {\mathfrak{C}_{k,l}}, then we can create random-looking {(k,l)}-sum-free subsets of density nearly {\delta}. In particular, using a transference tool (namely, Loeb measure), we can create random-looking infinite {(k,l)}-sum-free subsets of upper density nearly {\delta}. Therefore we must have {\delta \leq 1/(k+l)}.

Jing also looked the lower bound. Famously Bourgain did some work on this, showing that every set of {n} positive integers has a sum-free subset of size at least {(n+2)/3}, emphasis on the {2}. With a more interesting argument Bourgain showed that there is a {(3,1)}-sum-free set of size at least {n/4 + c \log n / \log \log n}. Jing adapted this argument for many pairs {(k,l)}, though doing so for {(k,l) = (2,1)} remains a stubborn open problem. It is interesting to see how Bourgain’s method adapts, and I look forward to thinking more about this.

It is all-too-easy to pose difficult open questions in this circle of ideas, but I will risk two.

  1. This question is posed by Jing. Let {\square^d} represent {d}-dimensional projective cubes. That is, {\square^2} represents the sum configuration {\{x, y, x+y\}}, {\square^3} represents {\{x, y, z, x+y, x+z, y+z, x+y+z\}}, etc. What is the avoidance density of {\square^d}?
  2. What is the avoidance density of {2x+y = z}? Or for that matter any equation without only {\pm1} coefficients?

One general theorem that comes out of the transference method is the following. Say that {\mathfrak{C}} is dilation-invariant if whenever {C \in \mathfrak{C}} we have also {\lambda C \in \mathfrak{C}} for every {\lambda \in \mathbf{N}}. (Everything we have mentioned is dilation-invariant, apart from {x-y = c}.)

Theorem 2 Let {\mathfrak{C}} be any dilation-invariant configuration. Then the avoidance density of {\mathfrak{C}} is the same as the largest multiplicative upper density of an infinite {\mathfrak{C}}-free subset of {\mathbf{N}}.

Here multiplicative upper density is defined as the asymptotic upper density with respect to your favourite multiplicative Følner sequence, such as

\displaystyle  A_n = \{p_1^{e_1} \cdots p_n^{e_n} : 0 \leq e_i < n\},

where {p_1, p_2, \dots} are the primes in any order. Therefore questions 1 and 2 ask for the largest multiplicative upper density of {\square^d}-free or {2x+y=z}-free subset of {\mathbf{N}}, respectively.

Here is my general conjecture.

Conjecture 3 Let {\mathfrak{C}} be any type of configuration described by a single equation {a_1 x_1 + \cdots + a_k x_k = 0}, with {a_1, \dots, a_k \in \mathbf{Z}\setminus\{0\}}. Then the avoidance density of {\mathfrak{C}} is the same as the measure of the largest {\mathfrak{C}}-free subset of {\mathbf{R}/\mathbf{Z}}.

One might also conjecture this for higher-complexity systems like {\square^3}, but I have no idea whether that is likely to be true! Certainly the lower bound holds: if {S \subset \mathbf{R}/\mathbf{Z}} is {\mathfrak{C}}-free then the avoidance density of {\mathfrak{C}} is at least {\mu(S)}. But even if this is true it’s not very useful! What is the largest {\square^3}-free subset of {\mathbf{R}/\mathbf{Z}}? What is the largest {2x+y=z}-free subset of {\mathbf{R}/\mathbf{Z}}? These questions are simpler, certainly, but still not at all obvious!

Edit: The largest {\mathfrak{C}}-free sets {S \subset \mathbf{R}/\mathbf{Z}} I know in these two cases are the following. I have made very little effort to look for other solutions. Can you spot a better solution? Is there a systematic way of solving this problem?

{\mathfrak{C}} {S} {\mu(S)}
{\square^3} {(1/4, 3/4)} {1/2}
{2x+y=z} {(1/8, 3/8)} {1/4}

Later edit: Here are some variants of the {\square^3} question, essentially due to Cosmin Pohoata:

  1. Suppose {A} is a {\square^3}-free subset of {\{1, \dots, n\}}, e.g., {\{x \equiv 1,2 \pmod 3\}}, or {(n/3, n]} (both of these are {S_\theta} for some {\theta}). Is it true that {|A| \leq (2/3 + o(1))n}?
  2. Suppose {A} is a {\square^3}-free subset of {\mathbf{Z}/2^k\mathbf{Z}}, e.g., {\{x \equiv 1, 4 \pmod 8\}}. Is it true that {|A| \leq (5/8 + o(1))2^k}?

The interesting point here is that shoehorning the set into {\mathbf{Z}_2} appears to come with a density hit (as with sum-free sets), and the extremal set appears not to be of the form {S_\theta}.

Leave a comment