Zeus and Hades write numbers. Nuevo León Olympic Revenge P6.

See what a gem I came across. It’s a pity such kind of problems are scarcely seen at math competitions for students.

Problem 6. Zeus and Hades take turns writing numbers on an infinite board. Starting by Zeus, who writes the number 1, each player can either copy the number previously written, or divide this number by 2 and write it. After an infinite amount of turns, the numbers written on the board form a sequence (a_n){i\in\mathbb{Z}+}. The players then compute the sum

\displaystyle S=\sum_{i=1}^{\infty}a_i.

Zeus wins if this sum converges to a rational number. Otherwise, Hades wins.
Determine if any of the players has a winning strategy, and if so, determine who has it.
Proposed by Éric Hernández.

At each move Hades can make two choices. Vaguely, it means the set of different series, he can impact to be constructed, is non countable, but the rational numbers are. So, perhaps he could escape from all of them. Of course, it’s nonsense, it’s just intuition that turns out to be true. The strategy of Hades is simple. He enumerates the rationals as r_1,r_2,\dots and at each step he takes care to ensure the series cannot converge to r_i,i=1,2,\dots. See that the same approach applies for any countable set of real numbers instead of the rational ones. Note also, there is no number theory here or something that plays with specific nature of the rationals. We only play we the freedom of two choices Hades has to evade consecutively each rational number.


Hades wins. He makes a list of all rational numbers r_1,r_2,\dots and for each r_i,i=1,2,\dots he makes a series of moves to ensure the series cannot converge to r_i. The first rule he complies to is as follows.

i) Hades can write the same number he receives from Zeus at most 2 consecutive times. The third time he is obliged to halve the number.

Further, for i=1,2,\dots Hades does the following procedure. He halves the number Zeus writes, waits for his opponent’s move and suppose Zeus writes some number \varepsilon :=2^{-k}, k\in\mathbb{N}. Let the partial sum of the series at this moment is S'. Hades compares r_i to S'.

1st case. r_i-S'\le (2+\frac{3}{8})\varepsilon. Hades makes the sum of the series to surpass r_i. He writes \varepsilon. Zeus writes a number that is at least \varepsilon/2, Hades again \varepsilon/2, Zeus writes at least \varepsilon/4 and finally Hades adds \varepsilon/8 (the third time he should halve the number). Thus, we came with a sum S'+(2+\frac{3}{8})\varepsilon\ge r_i and it’s guaranteed the series cannot converge to r_i.

2nd case. r_i-S'> (2+\frac{3}{8})\varepsilon. Hades makes a series of moves that guarantees S\le r_i-\varepsilon/4. He writes \varepsilon/2 and halves every subsequent number Zeus puts. Hence, the sum increases with no more than \varepsilon/2+\varepsilon/2+\varepsilon/4+\varepsilon/4+\dots. Apparently it cannot surpass 2\varepsilon. Hades applies to this strategy till the written number drops to \varepsilon/64. After that, he forgets about r_i and proceeds with the next number r_{i+1}. The sum of the series cannot exceed S'+2\varepsilon +\varepsilon/8\le r_i-\varepsilon/4. Indeed, complying only to i), starting from \varepsilon/64 we cannot add more than 4\sum_{j=0}^{\infty} 2^{-6-j}\varepsilon<\varepsilon/8.


IMO 2019 Shortlist, N6. Diophantine Approximations.

Problem N6, ISL 2019. Let H = \{ \lfloor i\sqrt{2}\rfloor : i \in \mathbb Z_{>0}\} = \{1,2,4,5,7,\dots \} and let n be a positive integer. Prove that there exists a constant C such that, if A\subseteq \{1,2,\dots, n\} satisfies |A| \ge C\sqrt{n}, then there exist a,b\in A such that a-b\in H. (Here \mathbb Z_{>0} is the set of positive integers, and \lfloor z\rfloor denotes the greatest integer less than or equal to z.)

About motivation.

I tried to construct a set A of positive integers a_1<a_2<\dots<a_m, as many as possible, and |a_i-a_j|\notin H, 1\le i,j\le m. The most natural algorithm is some sieve method of consecutively selecting a_1,a_2,\dots. Upon choosing a_1 we immediately mark some integers as forbidden, namely all that lie inside the intervals \left[a_1+j\sqrt{2}-1, a_1+j\sqrt{2}\right], j=1,2,\dots , the red segments in the figure 1.
Each subsequent number a_i is chosen to lie outside the union of the red intervals, already constructed, and we add another forbidden intervals for a_i itself (those on figure 1). We hope, the red ink on the Ox axis becomes more and more, until there are no more holes for number selection. It’s natural to estimate how much the red ink increases after each step.


Suppose we have a set A of natural numbers 1\le a_1<a_2<\dots<a_m\le n such that |a-b|\notin H for any a,b\in A. For every i=1,2,\dots,m we construct a set \mathcal{I}_i of intervals \left[a_i+j\sqrt{2}-1, a_1+j\sqrt{2}\right], j=1,2,\dots (the red ones in figure 1 below). Clearly, \displaystyle a_{i+1}\notin \bigcup_{j=1}^i \mathcal{I}_j

Figure 1

It’s natural to estimate how much \bigcup_{j=1}^i \mathcal{I}_j increases after each step. Since the red pattern in figure 1 is periodical with period \sqrt{2} we wind it up on [0,\sqrt{2}] assuming 0 and \sqrt{2} are glued.

For a real number x we denote by t(x) the unique point on \left[0,\sqrt{2}\right) such that x-t(x)=j\sqrt{2} for some j\in\mathbb{Z}. Thus 1,2,\dots,n are mapped to t(1),t(2),\dots,t(n) on \left[0,\sqrt{2}\right]. For any a_i the red pattern on figure 1 is mapped to [t(a_i)-1,t(a_i)], with the convention 0 and \sqrt{2} are glued, i.e. if t(a_i)-1<0 we continue drawing the red pattern jumping from 0 to \sqrt{2} and then to the left (see figure 2).

Figure 2

At each step, the next point t(a_{i+1}) is outside the red set, hence at each step the measure of the red set increases by t(a_{i+1})-t(a_i). We’ll prove, it cannot be too small. We know, t(a_i)=a_i-k\sqrt{2}, t(a_{i+1})=a_{i+1}-\ell\sqrt{2} for some k,\ell\in\mathbb{N}. Thus,

\displaystyle t(a_{i+1})-t(a_i)=a_{i+1}-a_{i}-(\ell-k)\sqrt{2}.

Since \sqrt{2} is an algebraic number of degree 2, Liouville’s theorem (lower bound for diophantine approximation) says, there exists an absolute constant C such that

\displaystyle \left|\sqrt{2}-\frac{p}{q}\right|\ge \frac{C}{q^2}

It implies

\displaystyle t(a_{i+1})-t(a_i)\ge \frac{C}{a_{i+1}-a_i}.

Jensen’s inequality for the convex function \frac{1}{x} yields

\displaystyle \sum_{i=1}^{m-1} t(a_{i+1})-t(a_i)\ge C \frac{(m-1)^2}{a_m-a_1}\ge C\frac{(m-1)^2}{n}.

Finally, if m>C_1\sqrt{n}, for some constant C_1, we obtain \displaystyle \sum_{i=1}^{m-1} t(a_{i+1})-t(a_i)>\sqrt{2}, which shows some a_i should hit the red set, and the result follows. \blacksquare


The same result holds if replace \sqrt{2} with some other algebraic number of degree 2 i.e. a+b\sqrt{c}. The crucial fact exploited is that such numbers have comparatively bad diophantine approximations with rational numbers. In general, it’s not true for any irrational number \alpha in place of \sqrt{2}. The reason is, it may happen at every subsequent step the red set (in the above argument) increases with very small amount allowing us to choose more than O(\sqrt{n}) “good” terms of the sequence.
I also came across opinions that this is not a NT problem, or even it’s a combinatorial one. Most importantly, this is a good problem! Of course, apart from NT stuff, it deals with some estimates, some inequalities, but I don’t see any combinatorics here. Thinking like this, it may turn out that Dirichlet’s approximation theorem is combinatorics because it uses the pigeon hole principle!

IMO 2019 Shortlist: https://www.imo-official.org/problems/IMO2019SL.pdf

AoPS thread of this problem: https://artofproblemsolving.com/community/c6h2279051p17829382

Liouville’s theorem: http://www-users.math.umn.edu/~garrett/m/mfms/notes_2013-14/04b_Liouville_approx.pdf

A Functional Inequality. An Application of Baire Category Theorem.

Here is a question I saw and it interested me.

Problem. Is there a function f:\mathbb R\to\mathbb R^+ such that for any rational number x and any irrational number y, f(x)f(y)\leq|x-y| ?

Let me share something in advance: it’s not an Olympiad problem. It looks like, but it uses something beyond the scope taught in high school. Not that it’s too involved, I think it could be explained to a student with math interests. I tried at the last section of this post. But anyway, Baire category theorem is beyond the high school scope, for sure.

The problem appeared in high school section of a forum, and I began to play with it trying to construct such function. The statement roughly says if x is rational, y – irrational and they are at near distance, then it’s not possible both functional values taken at these points be small. That’s why it’s reasonable to partition the reals into sets where the function takes close values and particularly when these values are small. For example let A_{\varepsilon} be the set of irrationals where f takes values between \varepsilon and 2\varepsilon and r is some rational number. We want to define f(r). Suppose r is at distance d from A_{\varepsilon}. Then we should define f(r) as something about d/\varepsilon. We can do it of course. But what if d=0, i.e. there are points of A_{\varepsilon} in any neighbourhood of r ? In this case it’s not possible to assign a value to f(r) not breaching the given condition. So, it would be okay if any such set A_{\varepsilon} is not “fat”, it shouldn’t be dense in any interval (a,b) because otherwise taking some rational number r\in(a,b) would breach the rule. At this point I realised the Baire category theorem is about to be involved. Here is the solution.


Such function does not exist. Suppose on the contrary, it does exist and consider the sets

\displaystyle A_n := f^{-1}\left ( \left [\frac{1}{n},\infty\right)\right)\bigcap \left (\mathbb{R}\setminus \mathbb{Q} \right), n=1,2,\dots


\displaystyle \bigcup_{n=1}^{\infty}A_n=\mathbb{R}\setminus \mathbb{Q}\quad (1)

Since the set in the RHS of (1) is of second Baire category, it follows (by Baire category theorem, see the next paragraph) that at least one of sets in the LHS of (1), say A_m is not nowhere dense. It means there is an interval (a,b), a<b such that A_m is dense in (a,b). Take some r\in\mathbb{Q}, r\in (a,b). For any x\in A_n it holds

\displaystyle 0<f(r)\cdot \frac{1}{m}\le f(r)f(x)\le |x-r|

Since, we can choose x\in A_n as close to r as we want, we get contradiction.

Baire Category Theorem.

Here we consider only the case the underlying space is \mathbb{R}. It can be generalized for any complete metric space, etc., and the proof is the same. In order to make the statement, we need a couple of definitions.

1) We say a set X\subset \mathbb{R} is dense in some open interval (a,b) if any open subinterval of (a,b) meets X.
2) We say X\subset \mathbb{R} is nowhere dense if there does not exists an interval (a,b) where X is dense. In other words, X is nowhere dense if for any interval (a,b) we can find a subinterval (c.d)\subset (a,b) which does not meet X.

Theorem (Baire category theorem). \mathbb{R} cannot be represent as a countable union of nowhere dense sets.

Before presenting a short proof, let us see that it also implies the set of irrational numbers also cannot be represented as countable union of nowhere dense sets. Indeed, suppose on the contrary, it can and let \mathbb{R}\setminus \mathbb{Q}=\bigcup_{i=1}^{\infty} X_i , where any X_i is nowhere dense. Then

\displaystyle \mathbb{R}=\left(\bigcup_{i=1}^{\infty} X_i\right)\bigcup \left( \bigcup_{i=1}^{\infty}\{r_i\}\right)

where r_1,r_2,\dots is some enumeration of the rationals. Note that all sets in the RHS are nowhere dense, which is contradiction.

Proof of the theorem. Arguing by contradiction, assume

\displaystyle \mathbb{R}=\bigcup_{i=1}^{\infty} X_i

where X_i are nowhere dense sets. Take an interval (a,b). Since X_1 is nowhere dense, we can find an interval [a_1,b_1]\subset (a,b) which is free of points of X_1. With the same argument we can find [a_2,b_2]\subset [a_1,b_1] free of points of X_2. Proceeding in this way, we construct a sequence of nested closed intervals [a_n,b_n]\subset [a_{n-1},b_{n-1}] and the n\,th one is free of points of X_1,X_2,\dots,X_n. Consider the sequences a_1,a_2,\dots and b_1,b_2,\dots. The first one increases, the second – decreases. Furthermore, they are bounded, hence they converge to some limits a, correspondingly b (it may happen a=b). You should believe me for the last sentence, the main reason is that \mathbb{R} has no holes (it’s constructed this way!). Thus, [a,b]\subset [a_n,b_n], n=1,2,\dots, hence it does not meet any of the sets X_i, i=1,2,\dots, which is impossible, because \mathbb{R} is a union of these sets. Contradiction.


IMO 2020, Problem 5. A Deck of Cards.

This is the problem that sparked controversies, questioning is it a number theory problem or a combinatorial one (see [1]). A rather fancy solution is presented, which I made on purpose. The idea is not some new original one (btw all solutions I know, are based on the same idea), it’s just written in a way that separates the NT stuff from so called combinatorial one. So, we first establish two properties that are pure NT, and then use some logical deduction, based on them, to conclude what is needed. I suppose no one would dispute these properties are NT. They are. Okay, let’s first see the problem and the solution. The discussion will be continued.

Problem 5, IMO 2020. A deck of n > 1 cards is given. A positive integer is written on each card. The deck has the property that the arithmetic mean of the numbers on each pair of cards is also the geometric mean of the numbers on some collection of one or more cards.
For which n does it follow that the numbers on the cards are all equal?
Proposed by Oleg Košik, Estonia.


The answer is: for any n. Given two natural numbers a,b we write a\prec b iff any prime factor of a is also prime factor of b. This relation is a preorder: a\prec a holds, and a\prec b,b\prec c implies a\prec c. Obviously

i) a\prec b and a\prec c implies a\prec b\pm c

The property of the statement doesn’t change if multiplying or dividing the numbers on the cards by some natural number, so we can assume the greatest common divisor of the given numbers is 1. Let a_1\ge a_2\ge\dots\ge a_n be those numbers. It easily follows, using the property of the cards

ii) For any 1\le k\le n-1 there exists 1\le \ell\le k such that a_{\ell}\prec a_k+a_{k+1}

Now, ii) yields a_1\prec a_1+a_2, and glancing at i) we obtain a_1\prec a_2. An easy induction, using only i) and ii) (and reflexivity of \prec ) gives a_1\prec a_k, k=1,2,\dots,n. Since \gcd(a_1,a_2,\dots,a_n)=1 it follows a_1=a_2=\dots=a_n=1. \blacksquare


Establishing i) and ii) is pure NT stuff. It doesn’t involve Fermat’s little theorem or Euler’s totient function, but still it uses basic NT facts. Oops, … there is an argument a bit outside NT that is used in ii). Namely, we must know the GM of some numbers is between their minimum and maximum, but it surely is not combinatorics (thou, who knows?), so it’s acceptable! So, we come to the point some conclusions are made based on these two properties. I wouldn’t call them combinatorics! It’s basically logical deductions, it is what math is. So, we can call it mathematics. I could agree to some extent that number theory used is perhaps too weak, but it doesn’t mean the problem is a combinatorial one. It is a good math problem.
I have seen curious reflections in [2], which are borrowed here.
In Olympiad mathematics the topics are defined in the following way. Routine functional equations and inequalities that use at least once Cauchy-Schwarz, kind of AM-GM, Jensen, Karamata, etc., etc. are called “algebra”. Synthetic 2-dimensional Euclidean geometry is called “geometry”, Fermat’s little theorem, Cauchy totient function, and modular arithmetics are “number theory“, and everything else is called … “combinatorics”.
It should be called mathematics instead, but since this word is reserved for the universe, another has been invented.

[1] https://artofproblemsolving.com/community/c6h2278645p17821528
[2] https://artofproblemsolving.com/community/c6h2278645p17876202

IMO 2019 Shortlist, A5. An Application of Divided Differences.

We show a bit different approach on A5 problem of 2019 shortlist. It uses very basic properties of divided differences. We have considered in some blog posts here how finite differences can help in Olympiad problems (see part 1 and part 2). This post is structured as follows. After presenting the text of the problem, a short overview of basic properties of divided differences is given. The solution is at the end and a brief motivation follows. So, let us begin with the problem.

Problem (A5, IMO 2019 SL). Let x_1, x_2, \dots, x_n be different real numbers. Prove that

\displaystyle \sum_{1 \leq i \leq n} \prod_{j \neq i} \frac{1-x_{i} x_{j}}{x_{i}-x_{j}}=\begin{cases} 0 \, , \,n \text{ is even }\\ 1\,,\, n \text{ is odd}\end{cases}

Divided differences.

Definition. Suppose we have some real function f and distinct points x_0,x_1,\dots,x_n. Divide difference f[x_0,x_1,\dots,x_n] taken at these points is defined recursively: f[x_i]:=f(x_i),

\displaystyle f[x_0,x_1,\dots,x_n]= \frac{f[x_1,x_2,\dots,x_n]- f[x_0,x_1,\dots,x_{n-1}]}{x_n-x_0}

For example

\displaystyle f[x_0,x_1]=\frac{f(x_1)-f(x_0)}{x_1-x_0}

\displaystyle f[x_0,x_1,x_2]=\frac{ \frac{f(x_2)-f(x_1)}{x_2-x_1} -\frac{f(x_1)-f(x_0)}{x_1-x_0}}{x_2-x_0}

Property 1. Divided difference can be explicitly written as

\displaystyle f[x_0,x_1,\dots,x_n]=\sum_{i=0}^n \frac{f(x_i)}{\prod_{j\ne i} (x_i-x_j)}

Property 2. If P is a polynomial with degree n-1 then P[x_0,x_1,\dots,x_n]=0.

The next property won’t be used here, but it is somehow an analogue of the Taylor expansion and can be used as interpolation formula (in case P is not a polynomial of degree n).

Property 3 (Newton’s formula). If P is a polynomial of degree n and x_0,x_1,\dots,x_n are any distinct numbers, it can be represented as

\displaystyle P(x)=P[x_0]+P[x_0,x_1](x-x_0)+\dots +P[x_0,x_1,\dots,x_n] (x-x_0)\cdots (x-x_n)

Solution of the Problem.

It’s enough to prove the claim in case non of the numbers x_i,i=1,2,\dots,n is equal to 1 or -1. Indeed, having proven this result and using the continuity of the expression the general claim also follows. Denote

\displaystyle D(x_1,x_2,\dots,x_n):=\sum_{1 \leq i \leq n} \prod_{j \neq i} \frac{1-x_{i} x_{j}}{x_{i}-x_{j}}

It easily follows

\displaystyle D(x_1,x_2,\dots,x_n,1,-1)=-D(x_1,x_2,\dots,x_n)+1+(-1)^{n-1}\quad (1)

Set x_{n+1}:=1, x_{n+2}:=-1 and consider the function

\displaystyle P(t):= \frac{\prod_{j=1}^{n+2}(1-tx_j)}{1-t^2}

Note that P(t) is a polynomial of degree n since the denominator is canceled out with (1-tx_{n+1})(1+tx_{n+2}). We have

\displaystyle D(x_1,x_2,\dots,x_n,x_{n+1},x_{n+2})=\sum_{i=1}^{n+2}\frac{P(x_i)}{\prod_{j\ne i}(x_i-x_j)} +\frac{1}{2}+\frac{(-1)^{n-1}}{2}

because the summands for i=1,2,\dots,n in the RHS are exactly the same as in the LHS and the last two terms are the needed corrections for the summands corresponding to i=n+1 and i=n+2. Note that

\displaystyle \sum_{i=1}^{n+2}\frac{P(x_i)}{\prod_{j\ne i}(x_i-x_j)}=0

since it is a divided difference of the polynomial P(x) (of degree n) taken at the points x_1,x_2,\dots,x_{n+2} (properties 1 and 2 are used). Hence

\displaystyle D(x_1,x_2,\dots,x_n,x_{n+1},x_{n+2})=\frac{1}{2}+\frac{(-1)^{n-1}}{2}

Combining it with (1) we get

\displaystyle D(x_1,x_2,\dots,x_n)= \frac{1}{2}+\frac{(-1)^{n-1}}{2}

and the result follows.


The denominators \prod_{j \neq i} (x_i-x_j)\,,\, i=1,2,\dots,n resemble the explicit representation of a divided difference (see Property 1), so it’s a natural approach to try to find a function f such that

\displaystyle f(x_i)=\prod_{j\ne i}(1-x_{i} x_{j}), i=1,2,\dots,n

so the expresion in the problem statement would be f[x_1,x_2,\dots,x_n]. An obvious candidate is

\displaystyle f(x):= \frac{\prod_{j=1}^{n}(1-x x_{j})}{1-x^2}\quad (2)

Unfortunately, f is not a polynomial (for which finite differences are easier to deal with). But here is the essential moment. If 1 and -1 are present among x_1,x_2,\dots,x_n then f becomes a polynomial of degree n-2 which finite difference, taken at n points, is zero. There is a little caution to be paid in this case when x=\pm 1, but it’s a trifle. In the general case I added 1,-1 as additional points to x_1,x_2,\dots,x_n and saw it disturbed very little the expression. The rest of the job was to adjust things.

Alice’s Map of Wanderland. IMO 2019 Shortlist, Problem C8.

I want to comment the C8 problem of this 2019 IMO Shortlist. It somehow disappointed me. No, I do not say it’s a bad problem, it’s a very good one. But when I saw it enumerated as C8, the first perception was far more promising than it turned out to be.
I wouldn’t provide a full solution in this post, so if someone is reading this only because of the solution, please see the official one, or another one which will be referenced at the end. But first things first. Here is the statement.

Problem C8 (IMO 2019 SL). Alice has a map of Wonderland, a country consisting of n \geq 2 towns. For every pair of towns, there is a narrow road going from one town to the other. One day, all the roads are declared to be “one way” only. Alice has no information on the direction of the roads, but the King of Hearts has offered to help her. She is allowed to ask him a number of questions. For each question in turn, Alice chooses a pair of towns and the King of Hearts tells her the direction of the road connecting those two towns.
Alice wants to know whether there is at least one town in Wonderland with at most one outgoing road. Prove that she can always find out by asking at most 4n questions.

So, translated in graph theory language, we have a tournament G of n vertices and 4n questions are at our disposal in order to establish has the the tournament some property P or not. The property P states: “There is a vertex in G with at most one outbound edge”. It doesn’t sound good for me, so let’s take its negation \overline{P}

\overline{P}: “Every vertex of G has at least two outbound edges.”

At this point, and without any playing around, the problem is promising. You have about n^2/2 edges and you must with some O(n) questions, querying the directions of edges, to establish whether each out-degree is at least 2. Unfortunately, just one move ahead is an idea that shows it can be done with O(n) questions.

Take some vertices a,b and query the direction of the edge connecting them, let it be a\to b. Take another vertex c and query the edge ac. The worst case scenario should be c\to a since if it were a\to c we could have eliminated the vertex a with two questions. By “eliminate” I mean the vertex is of no interest to us anymore, since it has out-degree 2. Note that we have of average 4 questions per vertex to estimate its out-degree. So, suppose the direction is c\to a. Now, after querying bc, we remain with a cycle a\to b\to c\to a in case no vertex of outdegree 2 is revealed. So far, so good. With one question per vertex we have determined 3 vertices, each one with outdegree at least 1. Take another vertex d and question the edges da and db. Then, we for sure reveal a vertex with outdegree 2 among a,b,d. It was about this moment where I lost the enthusiasm with this problem. Thou, the main obstacle is still ahead.

An obvious algorithm is based on the observation just made. Two questions are enough per some vertex among any given four vertices a,b,c,d to unveil an out-degree at least two. I mean we can map the vertex with out-degree 2, say a, to the questions asked about those out-edges. Thus, after consecutive elimination of vertices we are at position with at most three vertices remaining, say a,b,c and all the other vertices with out-degrees at least two. The worst case scenario is when we have a cycle a\to b\to c\to a. To be sure about the out-degrees of a,b,c we must query each of them with all of the remaining vertices. About 3n questions. Another 2n already asked, and we managed to make the distinction with no more than 5n questions.
So, why this problem is C8? There is some difficulty optimizing this algorithm to get 4n questions. But, we managed with the most greedy and dull algorithm to make 5n questions, so perhaps some optimization is not an issue.

There is another point which somehow degrades the merit of the problem. When seeking to estimate the number of steps of some algorithm or any other variable or approximation, we usually don’t care the estimated value is multiplied by a constant. For example, in this case, there is a greedy algorithm that makes it with C\cdot n steps for some absolute constant C. So, we classify it as O(n) time algorithm an it’s enough at least for my worldview. Here, the struggle is to optimize the constant. As it appears, C=4 is the best possible one, as shown in the official SL paper.

One may see the official solution in [1] – an algorithm of 4n time. Pay attention to phase 3. This is what drops the constant from 5 to 4. All the other steps are almost greedy actions. A very similar solution is given in [2]. The essential point lies in step 2.
Still, I think this problem doesn’t deserve C8 position. It is not so difficult. One cannot say for sure, but in my humble opinion, if it had been chosen instead of C5 (IMO 2019 problem 3 – a social network), it would have been solved by far more students than IMO 2019 p3 itself.

[1] https://www.imo-official.org/problems/IMO2019SL.pdf
[2] https://artofproblemsolving.com/community/c6h2278997p17828883

Domination Number of a Graph with Given Minimum Degree.

I plan to discuss a characteristic of a graph, known as “domination number/dominating set”. An Olympiad problem will be discussed and also an interesting result of Alon and Spencer, given in their book “The Probabilistic Method”.

Interpret a graph as a set of cities and some roads between them. The domination number is the minimal number of military squads that can be deployed in the cities, such that for any city either there is a squad in it or there is a squad at one hop from it. That’s, the minimal number of squads needed to control all the cities.
The following problem was given at the finals of a Bulgarian international competition. At first glance it has nothing to do with the concept of domination.

Problem (V International Math Festival, Sozopol (Bulgaria) 2014). It is known that each two of the 12 competitors, that participated in the finals of the competition “Mathematical duels”, have a common friend among the other 10. Prove that there is one of them that has at least 5 friends among the group.

The configuration of the statement can be represented as a graph G with edges corresponding to friendships. Let us translate the statement of this problem with respect to the graph \overline{G} that is the complement of G. The condition that for every two vertices of G there exists another vertex connected to both of them can be worded as follows

(i) For any vertices v_1,v_2 of \overline{G} we have V(\overline{G})\setminus \big(N[v_1]\cup N[v_2]\big)\neq \emptyset, where by N[v] we denote the set of all neighbours of v plus v itself.

Two short definitions in order to proceed further.

Definition. We say a set of vertices v_1,v_2,\dots,v_k of a graph G is dominating set if

\displaystyle \bigcup_{i=1}^k N[v_i]=V(G)

We denote the minimum number of vertices needed to form a dominating set by \gamma(G) and refer to it as the domination number of G. \blacksquare

Now, (i) just means the domination number of \overline G is at least 3. Accordingly, an equivalent version of the problem can be stated:
For any graph \overline G with 12 vertices and with domination number at least three, there exists a vertex with degree at most 6.
Or, the shorter equivalent counterpoint:

Problem (V International Math Festival, Sozopol (Bulgaria) 2014). Any graph with 12 vertices and minimum degree at least 7 has domination number at most 2.

The Result of Alon and Spencer.

A natural question arises. Having a graph G with n vertices and minimum degree \delta to find some upper bound of the domination number of G. Since every vertex covers at least \delta+1 other vertices, one may expect we can find a dominating set of \frac{C}{\delta+1} vertices for some (perhaps very small) constant C. Maybe it can be obtained by choosing the vertices in a way their neighbours do not overlap too much. Unfortunately, we cannot do it for any graph. As shown in [1] there exists a graph with n vertices and minimum degree n/2 with domination number something like \ln n. It means, in some sense, we cannot expect the domination number to be less than \displaystyle C\frac{\ln \delta}{\delta+1}n for some constant C. Here is a result proved independently by Alon and Spencer, Arnautov, and Payan.

Theorem (Alon and Spencer [1], theorem 1.2.2). For any graph G with n vertices and minimum degree \delta

\displaystyle \gamma(G)\le \frac{\ln(\delta+1)+1}{\delta+1}n.

This theorem has a short and nice probabilistic proof, given in the mentioned book of Alon and Spencer. I present it here, but first, let’s see is this result helpful to our problem. Putting n=12,\delta=7 we get \gamma\le 4 which is a rough estimate, we aspire to get \gamma \le 2. Nevertheless, this result is sharp as n\to \infty. For the particular case n=12, \delta=7, the proof of \gamma\le 2 is a bashy casework, as expected. One may find it in [2]. It was the official solution for the problem we started with.

Proof of the theorem. Let p\in(0,1) (it will be determined later). Choose randomly vertices of the graph, each one independently with probability p. Let X be the chosen set and N[X] be the set \bigcup_{x\in X}N[x]. Apparently Y:=\big(V(G)\setminus N[X]\big)\cup X is a dominating set. Let estimate the expected value of the number of elements of Y. The expected value of |X| is np. Fix some vertex v\in V(G) and consider the probability of the event v\in V(G)\setminus N[X]. It happens exactly when neither v nor any neighbour of v is chosen. This probability equals (1-p)^{\deg(v)+1}\le (1-p)^{1+\delta}. Taking into account the linearity of expectation, we get

\displaystyle \mathbb{E}[|Y|]\le np+n(1-p)^{1+\delta}

Using 1-p\le e^{-p}, it yields

\displaystyle \mathbb{E}[|Y|]\le np+ne^{-p(1+\delta)}\quad (1)

We can vary p in (0,1) and the best estimate we can get is when f(p):=np+ne^{-p(1+\delta)} attains its minimum when p\in (0,1). A little bit calculus yields f'(p)=n\left(1-e^{-p(\delta+1)}\right), hence f'(p)=0 for \displaystyle p=\frac{\ln(1+\delta)}{1+\delta} and it’s indeed the minimum of f(p) in (0,1). Putting this value of p we get

\displaystyle \mathbb{E}[|Y|]\le \frac{n\big( 1+\ln (1+\delta)\big)}{1+\delta}

Since the expected value has an upper bounded as shown, there exists some occurence of Y which has at most \frac{n\big( 1+\ln (1+\delta)\big)}{1+\delta} elements. But remember, Y is a dominating set. Hence

\displaystyle \gamma(G)\le \frac{n\big( 1+\ln (1+\delta)\big)}{1+\delta}

[1] Noga Alon, Joel Spencer, The Probabilistic Method. Wiley, 3rd ed.
[2] https://artofproblemsolving.com/community/c6h1934627

IMO 2019 Shortlist, problem A4.

Problem. (A4, IMO SL, 2019) Let n\ge 2 be a positive integer and a_1,a_2,\dots, a_n be real numbers such that


Define the set A by

A:= \{(i,j) : 1\le i<j\le n, |a_i-a_j|\ge 1\}.

Prove that, if A is non empty, then

\displaystyle \sum_{(i,j)\in A} a_ia_j <0.

Solution. We may assume A:= \{(i,j) : 1\le i,j\le n, |a_i-a_j|\ge 1\}, that’s if (i,j)\in A then (j,i)\in A. We also assume a_i\ne 0, i\in[1..n], since removing zeroes doesn’t change anything. Denote by A' the complement of A in [1..n]\times [1..n] , i.e.

A' = \{(i,j) : 1\le i,j\le n, |a_i-a_j|< 1\}

It’s enough to prove

\displaystyle \sum_{(i,j)\in A'} a_ia_j >0\quad (1)


\displaystyle \sum_{(i,j)\in A} a_ia_j=(a_1+a_2+\dots+a_n)^2-\sum_{(i,j)\in A'} a_ia_j= -\sum_{(i,j)\in A'} a_ia_j

Consider the sets I_{-}:=\{i\in[1..n] : a_i<0, \exists j\in[1..n], a_j>0\text{ and }|a_i-a_j|<1\} and I_{+}:=\{i\in[1..n] : a_i>0, \exists j\in[1..n], a_j<0 \text{ and }|a_i-a_j|<1\}. It easily follows that if i_1,i_2\in I_- then |a_{i_1}-a_{i_2}|<1 implying (i_1,i_2)\in A'. Anaglagously, i_1,i_2\in I_+ implies (i_1,i_2)\in A'. We have:

\displaystyle \sum_{(i,j)\in A'} a_ia_j \ge \sum_{i,j\in I_-} a_ia_j+\sum_{i,j\in I_+} a_ia_j +2\sum_{i\in I_-,j\in I_+} a_ia_j=\left(\sum_{i\in I_-\cup I_+} a_i\right)^2

Note that we have equality in the above expression only when I_-\cup I_+=[1..n] and |a_i-a_j|<1, \forall i,j\in I_-\cup I_+=[1..n]. In this case the set A is empty, which is ruled out. Thus, it yields

\displaystyle \sum_{(i,j)\in A'} a_ia_j >0

and the result follows.

A Family of Sets. Miklós Schweitzer 2010, Problem 3.

Problem 3 (Miklós Schweitzer 2010). Let A_i,i=1,2,\dots,t be distinct subsets of the base set \{1,2,\dots,n\} complying to the following condition

\displaystyle A_ {i} \cap A_ {k} \subseteq A_ {j}

for any 1 \leq i <j <k \leq t. Find the maximum value of t.

Solution. Denote

\displaystyle D_1:=A_1; D_i:= A_i\setminus (A_1\cup A_2\cup\dots\cup A_{i-1}), i=2,3,\dots,t

\displaystyle A_{i,j} := D_i\cap A_j, 1\le i\le j\le t.

We have A_j=A_{1j}\cup A_{2j}\cup\dots\cup A_{jj}. The given condition yields

1) A_{ii}, i=1,2,\dots,t are disjoint sets.
2) A_{ij}\supset A_{ik} for any i\le j\le k\le t.

Let I:=\{1\le i\le t : A_{ii}\neq \emptyset\} and I consists of the indices \{n_i:i=1,2,\dots,s\} where 1\le n_1<n_2<\dots<n_s\le t. By 2) and since A_i are distinct sets, we have

\displaystyle \bigcup_{j=1}^{k} A_{j k} \supsetneq \bigcup_{j=1}^{\ell} A_{j \ell}

for each n_i\le k<\ell<n_{i+1}. These observations allow us to fully characterize such possible sequences of sets. Each such sequence of sets can be obtained applying the folloowing algorithm.
We start from a set A_1\subset [1..n] (A_1 may be empty) and a pool of elements [1..n]\setminus A_1. At each step i=2,3,\dots,t the set A_i is obtained from the set A_{i-1} by applying at least one of the following two operations (or both)

1) Remove from A_{i-1} some of its elements.
2) Add to A_{i-1} some elements from the pool (the pool is corrected accordingly).

Each element of [1..n] can be added and removed at most once, and the empty set can appear at most once. Hence, t\le 2n. It’s attained by the following configuration of 2n sets.

\displaystyle A_0:=\emptyset, A_i:=[1..i], i=1,2,\dots,n; A_{n+i}:=[i+1..n], i=1,2,n-1.


Numbers in a table. Bundeswettbewerb Mathematik, 2020, II round.

Here is the story. I saw a problem in a Bulgarian math forum. Not so difficult, but very pleasant. One of those, intuitively obvious problems with simple statements, for which the right path sometimes eludes. I posted a short solution on that forum and decided to put it also here in my blog. The next day I saw the same problem in AoPS and gave a link. After a short period of time I was notified it was from a still running competition, Bundeswettbewerb Mathematik, a federal mathematics competition in Germany. That post in AoPS was deleted and I was also asked to delete my blog post, which I did. After all, there was a competition still active with the same problem on the list. Cheaters are part of the world and always have been! Nevermind, the deadline of that competition was September 1st, so, now we can comment on it.

Problem. Given a m\times n grid board, n>m, a non negative number is written in each of its cells. Each column contains at least one positive number. Prove that there exists a square in the grid, with positive integer inside it, for which the sum of the numbers in its row is greater then the sum of the numbers in its column.

Since there are more columns than rows, there is a row with larger sum of cells than the sum of cells in some column. The problem is, they may intersect at an empty cell (with zero in it). But still, intuitively one is pretty convinced the claim holds. As a motivation, it’s instructive to consider the particular case when the numbers are just zeros and ones.
Another wording of the same claim: A bipartite graph G(A,B) with |B|>|A| is given. To each edge is assigned a positive number. Prove that there exists an edge a\,b, a\in A, b\in B such that the sum of numbers on the edges incident with a is greater than the sum of numbers on the edges incident with b.
Here A corresponds to the rows of the table, B corresponds to its columns, any edge can be interpreted as the cell on the intersection of corresponding row and column. Not that this model is easier to be considered, but at least it gets rid of the empty squares (containing a zero).


We delete the rows containing only zeros, thus we can assume in any row/column there exists a positive real number. Denote the numbers by c_{ij}, i=1,2,\dots,m, j=1,2,\dots,n. We put another two numbers a_{ij},b{ij} in every cell, defined as follows.

\displaystyle a_{ij}:=\frac{c_{ij}}{\sum_{k=1}^m c_{kj} }

\displaystyle b_{ij}:=\frac{c_{ij}}{\sum_{k=1}^n c_{ik} }

Clearly, \sum a_{ij}=n, \sum b_{ij}=m, hence \sum a_{ij}>\sum b_{ij}. It means there is a cell (i,j), with a_{ij}>b_{ij}. By the definition of a_{ij},b_{ij} it yields

\displaystyle \sum_{k=1}^m c_{kj}< \sum_{k=1}^n c_{ik}