Independent Vertex Sets in a Graph.

A problem given at a 2022 Bulgarian tournament will be commented on. As I learnt, it was given also in a similar Russian tournament. The idea used is instructive and we finish the arguments using the probabilistic method.

Problem. A graph G with n vertices is given, and x of its edges are colored red so that each triangle has at most one red edge. The maximum number of vertices in G that induce a bipartite graph equals y. Prove that n\ge 4x/y.

Solution. Let v_1,v_2,\dots,v_n be the vertices of G, and x_i be the number of red edges incident with the vertex v_i, i=1,2,\dots,n. Denote by N_r(v_i), i=1,2, the set of vertices connected to v_i through red edges. Apparently x_1+x_2+\dots + x_n=2x.

The main observation. Any two vertices in N_r(v_i) are not connected, that is, each N_r(v_i), i=1,2,\dots,n is an independent set. Indeed, if some two vertices in N_r(v_i) are connected we obtain a triangle with two red edges, which is ruled out.

The fancy statement about the maximum induced bipartite graph is just equivalent to show that there exist two independent vertex sets A,B with

\displaystyle |A\cup B|\ge\frac{4x}{n}.

Plausible candidates of these independent sets are N_r(v_i), i=1,2,\dots,n. Thus, it’s enough to prove that there exist 1\le i,j\le n with

\displaystyle |N_r(v_i)\cup N_r(v_j)|\ge \frac{4x}{n}=\frac{2(x_1+x_2+\dots+x_n)}{n}\qquad (1)

It’s quite intuitive. Indeed, since x_i=|N_r(v_i)|, we have

\displaystyle \max_{1\le i\le n} |N_r(v_i)|\ge \frac{x_1+x_2+\dots+x_n}{n}.

So, if there exists one independent set with half of the required elements, there will most likely be two of them with twice the number of elements!? At this point it’s tempting to forget about the fact that these sets are related to the structure of the graph and prove it for any sets. Unfortunately, it’s not true for arbitrary sets. But fortunately, some of the sets N_r(v_i), i=1,2,\dots,n do not intersect with each other.

The second observation. If v_i and v_j are connected then N_r(v_i)\cap N_r(v_j)=\emptyset.
Indeed, if there exists v_k\in N_r(v_i)\cap N_r(v_j) we would get a triangle v_k,v_i,v_j with two red edges, v_iv_k and v_jv_k.

We prove even something stronger. The condition (1) holds for some vertices v_i,v_j connected with a red edge. Take randomly a red edge and number its two ends randomly as 1st and 2nd. Let R_1 be a random variable that equals the number of red edges incident with the 1st end and R_2 – a random variable equal to the number of red edges incident with the 2nd end. Since the number of all red edges is (x_1+x_2+\dots+x_n)/2 and there are two possibilities to enumerate the ends of each one, we obtain.

\displaystyle \mathbb{E}[D_1]=\sum_{i=1}^n \frac{x_i}{x_1+x_2+\dots+x_n}\cdot x_i = \frac{\sum_{i=1}^n x_i^2}{\sum_{i=1}^n x_i}

The same holds for the expectation of D_2, hence by linearity of expectation,

\displaystyle \mathbb{E}[D_1+D_2]=\frac{2\sum_{i=1}^n x_i^2}{\sum_{i=1}^n x_i}.

By AM-QM inequality x_1^2+x_2^+\dots x_n^2\ge (x_1+x_2+\dots x_n)^2/n and we get

\displaystyle \mathbb{E}[D_1+D_2] \ge \frac{2\sum_{i=1}^n x_i}{n}

This means that there exists a red edge connecting vertices v_i and v_j for which (1) holds.

A Quick Review on Graph Connectivity. A 2022 Korea Test Problem.

Connected graphs, cut-vertices, k-connected graphs, … these notions are part of the variety of problems given at Math Olympiads. I plan to shed some light on these topics – definitions and core theory needed. A problem given at the 2022 Korea winter program practice test will be solved as an application of the theory.

Definitions and useful facts.

Definition. A graph is connected (or 1-connected) if for any two of its vertices there is a path that connects them. Any graph G can be split into connected subgraphs (components) G_1,G_2,\dots,G_k, such that there is no edge between G_i and G_j, i\neq j.
A graph is \mathbf{k}connected (k>1) if removing any k-1 of its vertices leaves graphs still connected. Here we consider only 2-connected graphs, i.e. removing any vertes of the graph leaves it still connected.

Some useful facts. Assume a graph is 2-connected. Then for any two of its vertices, there is a cycle that passes through them. And vise versa, if a graph has that property it is 2-connected. On this base, there is a simple construction that enables us to obtain any 2-connected graph. Start from a cycle and at every step add to the already constructed graph H a path v_0,v_1,\dots,v_k, k\ge 1 where v_0 and v_k are different vertices in H and v_1,\dots,v_{k-1} are new vertices (the set of which may be empty – in this case we add just an edge that connects two existing vertices of H).
Proofs of those facts can be seen in [1] (chapter 3.1). The handout in [2] contains some applications on this topic (and many others).
The last claim is useful when we have to prove something about a 2-connected graph and want to induct on the number of edges.
As said, any graphs is a union of its connected components. As a parallel, any connected graph is a structure that includes (in a sense) its 2-connected components.

Definition. A vertex v of a connected graph is called a cut-vertex (or articulation point) if deleteing v makes the graph disconnected. Two examples on the picture below – the red vertices are cut-vertices. The red edge on the second graph is called a bridge – upon removal of an edge like that the graph becomes disconnected.

Figure 1.

Definition. A maximal 2-connected subgraph of a connected graph G is called a block.

Look again on fig.1. The first graph has two blocks that are connected via the red cut-vertex. The second graph has three blocks – the bridge with its two ends is also a block. Intuitively, any connected graph can be viewed as its blocks that overlap at its cut-vertices. Moreover, this structure resembles a tree, as we will see.
The next figure (fig.2) shows a graph, its blocks and its cut-vertices – the red dots.

Figure 2.

Let us replace each block with a large white dot, and connect each cut-vertex (red points) to every block it belongs to. The result is illustrated on fig.3.

Figure 3.

Claim. The resulting graph is a tree. It’s called the block tree of the initial graph.
The proof is short, and also can be found in [1]. Keep in mind that each cut-vertex is part of the blocks it’s adjacent to in the block tree (red points on fig. 3). I think the mentioned theory is enough to proceed to the next section.

2022 Korea winter test problem.

Problem (2022 Korea winter test). Let n\ge 2 be a positive integer and S be a set of 2n airports. For two arbitrary airports A,B, if there is an airway from A to B, then there is an airway from B to A. Suppose that S has only one independent set X of n airports. Prove that there exists an airport P\in S\setminus X which satisfies following condition.
Condition : For each two distinct airports A,B\in S\setminus {P}, if there exists a path connecting A and B, then there exists a path connecting A and B which does not pass through P.

So, in another words, we have a simple graph G(V,E),|V|=2n and X\subset V is the only independent set with n vertices. We have to prove there exists v\in V\setminus X which is not a cut-vertex.
Let G(V_1),G(V_2),\dots,G(V_k) be all connected components of G. There is at least one V_i with |X\cap V_i|\le |V_i|/2. We denote V':=V_i, X':=X\cap V' and G' be the graph induced by V'. Then X' is the only independent set with |X'| vertices in G'. Further we work only with G'. The plan is to prove via induction on the number of blocks of G' the following claim. It involves a certain amount of casework, but it unfolds naturally.

Claim. Let G(V,E) be a connected graph with an unique independent set X of m vertices, and m\le |V|/2. Then there exists a vertex v\in V\setminus X which is not a cut-vertex.

Proof. Note first that if X is an unique independent set with some number of elements, it must be the maximum independent set.
Induction on the number of blocks of G. In case G contains only one block, that is, G is 2-connected, then all of its vertices are not cut-vertices.
Suppose now, G has multiple blocks. Let B be a block that is a leaf of the block tree – see figure 3. If B consists of at least 3 vertices, then there are two of them that are not cut-vertices and are connected, see fig.4.

Figure 4.

They cannot be both in X, and we are done. Let now B consists of less than 3 vertices. This means B is a bridge – v_1\,v_2 on figure 5.

Figure 5.

In case v_2\notin X we are done, so assume v_2\in X. Then v_1\notin X. Further, suppose v_1 is also connected to the blocks B_1,B_2,\dots,B_{\ell} in the blocks tree. We remove the vertices v_1 and v_2 and get a graph which consists of \ell connected components G_1.G_2,\dots,G_{\ell}, each one contains respectively the blocks B_1,B_2,\dots,B_{\ell}. Each of the graphs G_1,\dots,G_{\ell} has unique maximum independent set, denote it by X_i, i=1,2,\dots,\ell, and X=X_1\cup X_2\cup\dots X_{\ell}\cup\{v_2\}. There are two possible cases. Either there exists a graph G_i satisfying |X_i|<|V(G_i)\setminus X_i| or |X_i|=|V(G_i)\setminus X_i|, \forall i=1,2,\dots,\ell. Further we consider each of these cases.

1-st case. There is a graph, say G_1, that satisfies |X_1|<|V(G_1)\setminus X_1|.

If v_1 is connected to at least two of the other vertices of the block B_1, we apply the induction hypothesis to G_1 and are done, since removing a vertex from G_1 that does not break G_1 will also not break G.
Assume, v_1 is connected to only one vertex, say v_3 in B_1 – fig.6

Figure 6.

This means B_1 is a bridge consisting of v_1,v_3. If v_3\in X we are done again, applying the induction hypothesis to G_1. Assume v_3\notin X. If v_3 is connected to more than one block in G_1 we can apply again the induction with respect to G_1 and get a point v'\in G_1, v'\notin X taht can be removed leaving G_1 still connected. That point v' cannot be v_3, because it would disconnect G_1. So, in this case we are done again. Suppose, finally v_3 is connected to only one block, say B' (apart from B_1). Note that upon removing v_3 from G_1 the resulting graph again satisfies the hypothesis of the claim. In case v_3 is connected to more than one vertex in B' we are done again, for the same reasons. This means B' should be again a bridge, that is, v_3 is connected to only one vertex v_4\in B' but this time v_4 must be in X, because otherwise we can increase the maximu independent set in G by adding v_3 to X (v_1\notin X). Therefore, by applying the induction to G_1-v_3 we get the needed vertex.

2-nd case. All the graphs G_1,G_2,\dots,G_{\ell} satisfy the conditions of our claim. In this case, the picture is something like on figure 7.

As explained above (fig. 6), if one of vertices v'_1,v'_2,\dots,v'_{\ell} belongs to X we are done. Assume all these vertices are not in X. But, in this case we can remove v_2 from X and put in its place the vertex v_1, thus getting another independent set with the same number of elements. It means this case is impossible.

To recap. In any possible case we either used the induction hypothesis or found the needed vertex explicitly.

[1] R. Diestel, Graph Theory, 3-rd edition.
[2] Dragomir Grozev, Olimpiad Graph Theory, a handout.
[3] AoPS’s thread on the problem.

Olympiad Graph Theory.

I was delighted to be invited to give a lecture at our National team camp for the 2022 International Math Olympiad. I was told it had to be on Graph Theory and I was absolutely free to include whatever I wanted. That’s how the paper linked below came about – I wanted to give the students something written besides the lecture and exercises that I did. I checked a lot of problems from IMO’s, Shortlists, NMO’s, etc., most of which I have solved on the AoPS forum or commented on on this blog over a period of time. The needed theory was also included. I spent time on wording and style, but not much. Thus, some of it might be raw, so please, inform me if there are some typos or issues. The handout is intended for students who already have basic knowledge on Graph Theory. Most of the examples included are from prestigious math competitions or their shortlist.

Olympiad Graph Theory.

Three Graph Problems on IMO 2022?! Problem 3.

In some recent posts we commented on two problems of this year’s International Math Olympiad (see [1] and [2]). Both are excellent, with fresh ideas. I like the trend of the problems chosen in recent IMOs – they are not pure combinatorics or number theory, algebra, etc., they have elements from different subjects. It favors creative thinking, instead of applying well known techniques. The problem, I am going to present, is a wonderful example illustrating this fact. The statement is about a certain graph constructed on the primes as its vertices. It has two parts, the number theory part helps to establish two important properties of the graph. The rest is pure combinatorics.

Problem (IMO 2022, problem 3). Let k be a positive integer and let S be a finite set of odd prime numbers. Prove that there is at most one way (up to rotation and reflection) to place the elements of S around the circle such that the product of any two neighbors is of the form x^2+x+k for some positive integer x.

A quick comment first. The claim is also true if the condition the primes being odd is omitted. It’s given just for convenience. In fact, we will not use this hypothesis, at least not explicitly. Indeed, x^2+x is even, so if k is odd and p,q\in S, pq=x^2+x+k then p,q must be odd. If k is even than pq is even which means one of them is 2 and in this case the elements of S cannot be disposed around a circle in the desired way.

So, translated into graph theory language it looks like:
We have an (infinite) graph G with vertices all odd prime numbers. Two vertices p and q are connected if and only if there exists a positive integer x such that pq=x^2+x+k. We want to prove that any finite subgraph H of G contains at most one Hamiltonian cycle.

This is a very strong claim for a graph, its structure should not be so complicated. For instance, it holds if every vertex of G has a degree at most 2. We can try it as a first idea, though unfortunately this is not true. Trying to prove the degrees are 2 or less, we, of course, are doomed, but we could establish that it’s “almost” true. Each vertex p is connected with at most two other vertices less than p. Here, by “less” we mean the natural order of the primes. This is the easier part – there are only two properties of G we need, this is the first one.

Lemma 1. Each vertex (prime number) p is connected with at most two primes less than p.

Proof. Suppose, a prime p>2 is connected to two primes q,r with p>q>r>2. Then there exist two positive integers x,y such that

pq=x^2+x+k\,;\, pr=y^2+y+k\qquad (1)

Subtracting the two equalities, we get


Since p is the biggest one among p,q,r, by (1) we obtain x,y<p, thus p cannot divide x-y, hence it divides x+y+1, but x+y+1<2p therefore x+y+1=p. I think, almost every contestant has discovered this fact. One step further lies the observation that p cannot be connected to another prime (apart from q,r) that is less than p. Indeed, if s is a prime s<p and ps=z^2+z+k the same argument yields x+z+1=p therefore z=y and s=r. \blacksquare

What we have established till now is that G is (so called) two-degenerate graph. A graph is two-degenerate if its vertices can be enumerated as v_1,v_2,\dots such that v_{j} is connected to at most two vertices among v_1,v_2,\dots,v_{j-1}.
Unfortunately, this property is not enough to prove the uniqueness of Hamiltonian cycles. Bellow is a picture that presents a two degenerate graph with two Hamiltonian cycles.

Fig. 1

Indeed, the vertices can be ordered like it’s shown: 1,2,3,4,5, and every next vertex is connected with at most two of the previous ones. However, there are two Hamiltonian cycles: 1,2,3,5,4,1 and 1,3,5,4,2,1.

The second property of G is harder to establish though there are some hints. (I didn’t go further then the two-degenerate property). Look at the fig.1 again. The moment G breaches the uniqueness-of-Hamiltonian-cycle property is the moment we add the vertex 5. Assume, instead of connecting 5 to 3 and 4, we connect it to 2 and 3

Fig. 2

This graph still has a unique Hamiltonian cycle. Indeed, the vertex 5, as the last one, has degree 2, so if there are two Hamiltonian cycles, both of them pass through 5 like \dots,2,5,3,\dots and since 2 and 3 are connected, there are also two Hamiltonian cycles in the induced subgraph on 1,2,3,4. Which is false. The same argument can be used (as shown later) to establish Hamiltonian uniqueness in case every time we add a vertex and it’s connected to two of the previous ones, these two vertices are also connected.
At this moment one may say – wait, what if we connect the vertices 3 and 4 on fig. 1? We still get two Hamiltonian cycles!? The point is, if we do so, the graph would not be two-degenerate any more.

Fig. 3

The graph induced on 1,2,3,4 is not two-degenerate graph. It can’t be, since all its vertices have degree 3 and the last one added will be connected to three of the previous vertices.
The next claim is the toughest NT part of this problem and is the key property G satisfies. I have read in some forums that the problem was not difficult, etc. Had the next sentence was given as a hint it would have been easy indeed. Because, if you know what to prove, it’s straightforward at that level. The point is that it’s not that intuitive.

Lemma 2. Let p>2 be a prime number, connected to q and r\,,\, p>q>r>2. Then q and r are also connected.

Proof. Let pq=x^2+x+k, pr=y^2+y+k. As we showed p>x>y and

\displaystyle p(q-r)=(x-y)(x+y+1).

It implies p=x+y+1. Let us set x=p-s, y=s-1 where 1<s<p. Further,

pq=p^2-2ps+p+s^2-s+k\,;\, pr=s^2-s+k\qquad (2)

Hence, p\mid s^2-s+k and let s^2-s+k=up. Multiplying the equalities in (2) we get

qr=u(p-2s+1+u)=u^2-2us+u+up\qquad (3)

Putting back in (3) up=s^2-s+k we obtain

qr=(u-s)^2 +(u-s) +k

Thus, we showed qr=z^2+z+k where z\in\mathbb{Z}. Here, we have a “small” issue since it may happen z\le 0. If one trace back what u and s were, one can it’s mostly the case. Fortunately, if we can represent qr as qr=z^2-z+k where z> 0 we can represent it also as qr=(z-1)^2+(z-1)+k. \blacksquare

But, we have still an issue! What if z=0 or z=1. That is, what we proved is that qr is representable as z^2+z+k where z\geq 0. It’s slightly annoying since q and r are connected if and only if qr is representable as z^2+z+1 and z\in \mathbb{Z}^+. We cannot do anything, but to change the statement of the problem. Allow the vertices p,q be connected even if pq=0^2+0+k=k. It cannot change anything, since if we prove the claim for a graph obtained by adding a few more edges, it will be also true for the original graph. That said, let’s recap. We proved the graph G has the following two properties.

  • (i) The vertices of G can be ordered v_1,v_2,\dots such that v_j, j\ge 2 is connected with at most two among the vertices v_1,v_2,\dots,v_{j-1}, and
  • (ii) If v_j is connected to v_{\ell} and v_{m}, \ell, m<j then v_{\ell} and v_m are also connected.

This is enough to prove that there does not exist a subgraph H of G with two distinct Hamiltonian cycles. The argument is simple. Suppose there is a subgraph H like that and let the vertex v_m be the first one so that H appears after adding it. That is, there is no subgraph like that in the graph G' induced by v_1,v_2,\dots,v_{m-1} but there is one, say H, in the graph G'' induced by v_1,v_2,\dots,v_{m-1},v_m. Apparently, v_m\in H since it’s the first time H appears. It also means v_m is connected to exactly two vertices v_i,v_j, i,j<m and v_i,v_j\in H. So, we have at least two Hamiltonian cycles in H and both pass trough v_m like \dots v_i\, v_m\,v_j\,\dots. Since v_i and v_j are connected, by (ii) it yields there are also two Hamiltonian cycles in H-v_m, contradiction.

[1] IMO 2022, Problem 2.
[2] IMO 2022, Problem 6.

IMO Shortlist 2021, A1

A pleasant though not difficult problem from the 2021 IMO shortlist.

Problem (IMO SL 2021,A1 ). Let n be a positive integer and A be a subset of \{0,1,2,\dots, 5^n\} consisting of 4n+2 numbers. Prove that there exist a,b,c\in A such that a<b<c and c+2a>3b.

Solution. Assume on the contrary, c+2a\le 3b for any elements a<b<c from A. It means

c-a\ge \frac{3}{2} (c-b)\qquad (1)

Let the elements of A be a_1>a_2>\dots>a_m; m:=4n+2. Applying (1) consecutively for (a_1,a_2,a_3)\,;\, (a_1,a_3,a_4)\,;\, \dots, (a_1,a_{m-1},a_m) we get

\displaystyle a_k-a_1\ge \left(\frac{3}{2}\right)^{k-2}\cdot (a_2-a_1); k=3,4,\dots,m

The above inequality and a_2-a_1\ge 1 implies

\displaystyle a_m-a_1\ge \left(\frac{3}{2}\right)^{m-2}=\left(\frac{3}{2}\right)^{4n}.

Since \left(\frac{3}{2}\right)^4> 5 we get


which contradicts the condition A\subset \{0,1,\dots,5^n\}.

IMO 2021 Shortlist, C5.

The problem I am going to present was on the IMO 2021 shortlist, but it was not chosen as IMO problem, which is sad. This is a briliant problem, because it seams so easy, so easy, untill a moment later you realize it’s not!
I intend to present a solution to a slightly easier version of this problem and then reduce the original one to it.

Problem 1 (C5, IMO 2021 SL). Let n and k be two integers with n>k\geq 1. There are 2n+1 students standing in a circle. Each student S has 2k neighbors – namely, the k students closest to A on the left, and the k students closest to A on the right.
Suppose that n+1 of the students are girls, and the other n are boys. Prove that there is a girl with at least k girls among her neighbors.

So, let’s put 1 on each girl’s position and -1 on the boys’ positions. Thus, we have +1 and -1‘s placed on the circle with a positive sum, and want to locate a number equal to $+1$ such that the sum of all its neighbors up to the k-th position to the left and right and the number itself is positive. One may think: what if we take only the neighbors on the left (or right). Let’s discuss this problem first.

Problem 2. We have n, n\in\mathbb{N} numbers +1,0 or -1 written on a circle with positive total sum. Prove that there exists a number equal to +1, such that the sum of all its k, 0\le k\le n-1 neighbors on the left and the number itself is positive.

Proof. Enumerate the numbers anti-clockwise as x_1,x_2,\dots, x_n. Consider the sequence of their partial sums

\displaystyle s_m:=\sum_{i=1}^m x_i\,,\, m=1,2,\dots

where we assume a_{i+n}=a_i. Suppose, seeking contradiction, that the claim is not true. It means that for any x_m=1 we have x_m+a_{m-1}+\dots+x_{m-k}\le 0, that is, s_m\le s_{m-k-1}. Let’s focus on the behavier of the sequence \big( s_m\big)_{m=1}^{\infty}. At each next step, this sequence may jump one unit up, stay the same or decrease one unit. It converges to +\infty since \sum_{i=1}^n {x_i}>0. We also know that the following property holds

  • (i) For any m\ge k+2, whenever s_m jumps up one unit (that’s, when x_m=1) we have s_m\le s_{m-k-1}.

The point is that (i) rules out the possibility \displaystyle \lim_{m\to\infty} s_m=\infty. Indeed, if this were the case, then for any positive integer N larger than s_1,s_2,\dots,s_{k+2} there is a moment m such that s_m\ge N for the first time. But the property (i) prevents this to happen. This leads to contradiction, hence we established the claim of Problem 2. \blacksquare

Let’s now go back to the original Problem 1. As done above, place +1 and -1 instead of boy respectively girl. Denote the given numbers as a_1,a_2,\dots,a_n and let x_m:=a_m+a_{m-k-1}, m=1,2,\dots,n where a_{-i} means a_{n-i}. Note that x_m can take values 2,0,-2 and \sum_{i=1}^n x_i\ge 2.
By Problem 2 there exists x_m>0 such that s:=x_m+x_{m+1}+\dots+x_{m+k} is positive. This means two things. First, x_m must be 2 which means a_m=a_{m-k-1}=1. Second, s is at least 2 and then \displaystyle \sum_{i=-k}^k a_{m+i}=s-a_{m-k-1}\ge 1. We are done.

IMO 2021 Shortlist, C6. A Hunter and a Rabbit Again.

Do you remember problem 3 of IMO 2017? A wonderful problem! I glanced again on the results of that year and still cannot understand why they were so disastrous! The knowledge needed to solve it is just the Pythagoras theorem or an entry level of trigonometry. So, it was not about technique. Absolutely everyone at that level can make those calculations, and still only 1 student had something like 5 points. I am notified every time a guy posts a solution on AoPS thread, and sometimes I take a glance. Most of them are like: “…the best strategy of the hunter is bla-bla, so he cannot discern whether the rabbit is here or there…” and then a bunch of calculations follows that shows something could increase. The emphasize is always on the calculations! I dare say had the same problem was formulated in another way it could have been completely solved by at least 75% of the contestants. Just changing the wording could make the difference! But, this is a story for another post (maybe the next one). Oh, that problem was a gem!

There is another problem on 2021 IMO’s shortlist worded for a hunter and a rabbit. It was listed as C6 but I think, it would have been solved by many students if it had been chosen as IMO 2021 problem. This problem is more of a technique than any out-of-the-box idea. Let us recall the statement.

Problem (IMO 2022 SL,C6). A hunter and an invisible rabbit play a game on an infinite square grid. First, the hunter fixes coloring of the cells with finitely many colors. The rabbit then secretly chooses a cell to start in. Every minute, the rabbit reports the color of its current cell, and then secretly moves to an adjacent cell that is not visited before (two cells are adjacent if they share a side). The hunter wins after some finite time either

  • the rabbit cannot move; or
  • the hunter can determine the cell in which the rabbit started.

Decide whether there exists a winning strategy for the hunter.


The hunter can color the cells in a way that enables him to guess the starting point of the rabbit, providing the rabbit does not end up in a dead end.
We consider the colors as consisting of some layers, i.e. any color will be viewed as a tuple (c_1,c_2,c_3) where each variable (layer) c_i,i=1,2,3 can take a finite number of colors. Each layer has a purpose. The first one is for orientation – to know the directions of each move the rabbit makes (up,down,right, left). The second layer will ensure there are no two paths with identical sequence of colors that start from different cells within a fixed frame around the origin. The third layer is to give us the information about the region (frame) within which the rabbit starts.
This structure of colors is only for convenience, clearly this configuration can be encoded with one dimensional colors. The first layer c_1 takes values 1,2,3,4,5 and its purpose is to control the direction of the rabbit. We put these values on each row starting as 1,2,3,4,5,1,\dots and on the next row starting from 3, on the next one – from 5, then from 2,4,1 and so on. This coloring allows us to translate the sequence of colors into directions, say, up, left, down, down, right, etc. So, if we know the starting cell, this layer helps us to recover rabbit’s trajectory. It also means that two equal sequences of the first layer color correspond to trajectories such that one of them can be obtained by translating the other with the vector determined by their starting points. The second layer of colors will allow the hunter to locate the starting position of the rabbit, providing he knows the rabbit started in a fixed frame. The second layer will consist of only two colors.
Suppose the hunter knows the approximate location the rabbits started from, i.e. he knows that the rabbit started in a cell inside some (big) square (frame), say Q centered at the origin. Suppose all the cells inside a bigger square Q'\supset Q (also centered at the origin) have their second layer already colored. We claim that the hunter can consecutively color the second layer of each cell in a bigger square Q''\supset Q' in a way that there are not two paths starting from different cells in Q and leaving Q'' that have the same sequence of colors.
Let us fix two cells q_1,q_2\in Q, q_1\ne q_2 and v be the vector q_2-q_1. For any cell q' outside Q', but having a common side with Q' we consider the cells r:=q'+ v, s:=q'-v. They cannot be both in Q' (then q’ would have been inside Q' since Q' is convex). So, in case one of them is outside Q', say r, and s is inside this square, we set the second layer of q' to differ the second layer of s, and the second layer of r be thae same as that of s, that is, we color them alternatively. If both r,s are outside Q' we also color the second layers of r,q',s alternatively. We do this procedure for all q' outside Q' but with a common side with Q'.
Suppose now, there exists two paths, say, P_1,P_2 that start from q_1 and q_2 respectively, and with the same sequences of colors. Suppose, q' is the first cell that one of them hits outside the frame Q'. At this moment the other path hits one of the cells q'\pm v. But, we took care q' and q'\pm v have different second layer color. Thus, there are no paths starting from q_1 and q_2 respectively, that leave Q' and have the same sequence of colors. Further, we color somehow the second layer of those cells that have a common side with Q' and are still uncolored, and proceed with some other pair of cells q_1,q_2\in Q and with one unit bigger than Q' frame which we again denote as Q'. In the end, we expand the coloring to a frame Q'' and are sure that there are no two paths that start inside Q, leave Q'' and have the same sequence of colors.

It remains to invent some sign that would show us the approximate location the rabbit started from. That’s the third layer that comes to help. This layer consists of three colors, say, white, black and neutral. Place a strip, two cells wide, symmetrical to the origin, and color the third layer of its inner cells white and its outer cells black. Thus, we will know if the rabbit starts within this strip – it will cross the border as white, black pattern. Now, we set infinitely many concentric strips like that of rapidly increasing sizes. It’s enough to ensure that the shortest path between two consecutive strips is longer than the longest path between the previous strips. We color the third layer of all other cells neutrally. In this way, examining the sequence ot the third layer’s color, we can determine the frame from which the rabbit started.

So, the number of needed colors is 5\cdot 2\cdot 3=30. But looking closely to the third layer one may see that in order to determine the frame the rabbit starts in, we actually only need strips one cell wide, colored, say, white, and all the other cells colored neutrally. Hence we can decrease the number of total colors to 5\cdot 2\cdot 2=20.

Three Graph Problems on IMO 2022?! Problem 6.

The previous post, commented on problem 2 of IMO 2022 – a non standard nice functional equation. A graph theory approach was used although it served only for a visualization and convenience. Here, we will comment on Problem 6, which is clearly a graph theory problem. Thanks to Aleksandar Ivanov, some ideas came up in the discussion we had.
We will see that all simple graphs that allow a minimum number of up-paths, are exactly the graphs whose vertices can be partitioned into two parts T and I such that I is an independent set and the graph induced by T is a tree. Let us first recall the problem.

Problem (IMO 2022, problem 6). Let n be a positive integer. A Nordic square is an n \times n board containing all the integers from 1 to n^2 so that each cell contains exactly one number. Two different cells are considered adjacent if they share a common side. Every cell that is adjacent only to cells containing larger numbers is called a valley. An uphill path is a sequence of one or more cells such that:

  • (i) the first cell in the sequence is a valley,
  • (ii) each subsequent cell in the sequence is adjacent to the previous cell, and
  • (iii) the numbers written in the cells in the sequence are in increasing order.

Find, as a function of n, the smallest possible total number of uphill paths in a Nordic square.
(Nikola Petrović)

Consider the graph G(V,E) whose vertex set consists of all the cell of the board. Any two of them are connected only if they have a common side. We are allowed to choose the direction on each edge in E in such a way that G becomes an acyclic directed graph. We call a vertex v\in V a root (a valley) if there are only out-edges incident with v. We call a directed path that starts from a root, a up-path. The question asks for the minimum number of distinct up-paths.

Lower bound. Fix a directed edge e and start walking downward until reach a root. It is not possible to fall into a cycle because the graph is acyclic. Thus, we found a up-path which last edge is e. Hence, the number of up-paths that consist of at least an edge is at least the number of edges |E|. Further, there is at least one root, and so, an up-path that consists of only a vertex. Therefore the number of up-paths are at least |E|+1. This is true for any simple graph.

An example and attempt to generalize it.

It is the harder part of the job. Let’s first analyze what directions we have to introduce, so that the number of up-paths equals to the minimum possible, i.e. |E|+1. Suppose we can make a configuration like that. There must be only one root, otherwise the up-paths would be more than needed. Further, starting from that root and following the directions, there do not exist two paths from the root to the same vertex, unless this vertex has only in-bounded edges. Let I be the set of all those vertices with only in-bounded edges. If we remove I, (and the incident edges) the remaining graph is a directed tree. Moreover, I must be independent set of vertices, because it’s impossible an edge between two vertices of I to exists.
The conclusion is: The graph G (the undirected one) admits a decomposition into a tree T and a set I of independent vertices. On the other hand, if we can do this, we can direct edges as needed. Indeed, take a root of T and direct its edges from this root to the leaves. Next, direct the edges, that connect T and I, from T to I.

That’s, we can characterize all simple graphs that allow example of exactly |E|+1 up-paths. They are exactly those graphs whose vertices can be partitioned into two parts T and I, such that I is an independent set and the graph induced by T is a tree.

So, the question is: can we characterize graphs like that in a simpler way. For example, not all simple graphs can be partitioned into a tree and independent set. Note that the original graph is bipartite. I am convinced that there are bipartite graphs that cannot be decomposed like that. So, apparently, we cannot generalize the problem for, say, all graphs, or bipartite graphs.
If one tries to find a greedy algorithm that consecutively puts vertices into T or I one can notice that there is a big chance 2-degenerate graphs to allow partitioning like that.

Definition. We call a simple graph kdegenerate k\in\mathbb{N} if its vertices can be ordered v_1,v_2,\dots,v_n such that every vertex v_j has at most two neighbors among v_1,v_2,\dots,v_{j-1}.

Still, I have obstacles to prove any 2-degenerate graph has the needed partition. Most probably, this is not true. But, if we impose a further restrictions, it will be true.
Consider a graph G whose vertices V can be ordered as v_1,v_2,\dots, v_n with the following properties.

  1. For each j=1,2,\dots,m; m\le n the vertex v_j has only one neighbour among the previous ones v_1,v_2,\dots,v_{j-1},
  2. For each j=m+1,\dots,n all neighbours of v_j among the previous vertices v_1,v_2,\dots,v_{j-1} are either all among v_1,v_2,\dots,v_m or they are at most two, and one of them is among v_1,v_2,\dots,v_m

If a simple graph G satisfies (1) and (2) then a greedy algorithm can partitioned it into a tree and an independent vertex-set.
We put the first m vertices into the set T. It is indeed a tree so far. For any next vertex v_j, if it is connected (up to this moment) to only vertices among v_1,v_2,\dots,v_m we put v_j into I. Otherwise, v_j is connected to at most two previous vertices and one of them is in T. If the other one (if any) is also in T we put v_j in I. If the other one is in I, or does not exist, we put v_j into T. Clearly, T is still a tree and I – still independent set.

Three Graph Problems on this IMO?!

This year’s International Mathematical Olympiad has ended. The results are known, and I am happy that the Bulgarian team was 16th and won 5 medals and a honorable mention – 1 gold, 3 silver and a bronze. Congratulations to the entire team and their leaders.
Let me first comment on a nice functional equation from this Olympiad, it was problem 2. It’s not like the routine ones – like, you have a function that satisfies an equality on a certain domain. Here, you have a function that satisfies a certain constraint, and that constraint is more interesting than an equality. Therefore, it needs a fresh ideas, not just putting values back and forth, hoping to find some values and unravel the case.
Of course, there is no need to involve any graph theory in this problem, but it helped me visualize the situation more clearly, so I decided to keep the graph theory language in the explanation that follows.

Problem (IMO 2022, Problem 2). Let \mathbb{R}^+ denote the set of positive real numbers. Find all functions f: \mathbb{R}^+ \to \mathbb{R}^+ such that for each x \in \mathbb{R}^+, there is exactly one y \in \mathbb{R}^+ satisfying

\displaystyle xf(y)+yf(x) \leq 2.

Before presenting a solution, let me remark that the requirements of the statement can be weakened keeping the same result. It’s enough to impose that for each x \in \mathbb{R}^+, there is at least one, but at most countably many y \in \mathbb{R}^+ satisfying the inequality. The result will be the same – see at the end of the post.

Solution. Let us denote F(x,y):=xf(y)+yf(x). Consider a graph G(V,E) with vertices V=\mathbb{R}^+ and edges E defined as e=(x,y)\in E iff F(x,y)\le 2. It may happen x be connected to x, so the graph may contain loops. The imposed requirements imply d(v)=1,\forall v\in V, that is, we have a perfect matching (with the convention that a matching x to x is allowed). For convenience, for any edge e=(x,y)\in E we write F(e):=F(x,y).

Claim. Let E'\subset E be the subset of edges defined as

\displaystyle E':= \{e\in E : F(e)<2\}

Then E' is at most countable.

Proof. In order to prove the claim it is enough to prove that the set

\displaystyle E_{\varepsilon}:= \{e=(x,y)\in E : F(e)\le 2-\varepsilon; x,y\ge \varepsilon\}

is at most countable for any \varepsilon>0. Assume on the contrary E_{\varepsilon} is uncountable. Consider a subset A of \mathbb{R}_+^4 defined as

\displaystyle A:=\{(x,y,z,w): (x,y)\in E_{\varepsilon}, z=f(x),w=f(y)\}.

This is an uncountable subset of \mathbb{R}^4, hence there is a point \mathbf{u}_0=(x_0,y_0,z_0,w_0)\in \mathbb{R}^4 such that in each neighborhood of \mathbf{u}_0 there are infinitely many points of A. Note that x_0,y_0\ge \varepsilon. For each of these points either x\ne x_0 or y\ne y_0, so without lost of generality we can assume each neighborhood of \mathbf{u}_0 contains infinitely many points of A with y\ne y_0. For any \mathbf{u}=(x,y,z,w)\in A we have

\displaystyle |F(x,y)-x_0w_0-y_0z_0|=|xw+yz-x_0w_0-y_0z_0|\qquad(1)

Since xw+yz is a continuous function on the variables x,y,z,w, there exists a neighbourhood U of \mathbf{u}_0 such that

\displaystyle |F(x,y)-x_0w_0-y_0z_0|\le \frac{\varepsilon}{4}\,,\, \forall (x,y,z,w)\in U\cap E_{\varepsilon}

It means that taking any (x,y,z,w) and (x',y',z',w')\in U\cap E_{\varepsilon} we get

\displaystyle |F(x,y)-F(x,y')|\le \frac{\varepsilon}{2} \qquad (2)

Thus, F(x,y)<2 and F(x,y')<2 which means both (x,y) and (x,y') are edges in E' which is contradiction, since we can take y\neq y'. \blacksquare

So, we have proven there are at most countably many (x,y)\in \mathbb{R}_+^2 with F(x,y)<2. It means F(x,x)\ge 2 on a set X that contains all but countably many positive reals. Further,

\displaystyle F(x,x)=2xf(x)\ge 2, \forall x\in X.

yields \displaystyle f(x)\ge \frac{1}{x}, \forall x\in X. Therefore, for all but countably many edges (x,y)\in E we have

\displaystyle F(e)=xf(y)+yf(x)\ge \frac{x}{y}+\frac{y}{x}\ge 2\qquad(3)

where we used AM-GM inequality. Taking into account F(e)=2, there must be equality everywhere in the chain of (3), hence x=y for all but countably many (x,y)\in E. Thus,

\displaystyle f(x)=\frac{1}{x},\forall x\in X\qquad(4)

where \mathbb{R}\setminus X is at most countable. Since X is everywhere dense in \mathbb{R}_+, it easily follows (4) holds for all x\in\mathbb{R}_+.


As said, we can weaken the requirement and ask for each x\in\mathbb{R}^+ to exist at most countably many y\in\mathbb{R}^+ that satisfy the inequality. The only function that complies with this condition will be still f(x)=\frac{1}{x}. In this case, the edges incident with any vertex of the graph we constructed, are countably many. claim we proved. The claim, we proved still holds, but its proof needs to be changed a bit. Namely, there exists \mathbf{u}_0=(x_0,y_0,z_0,w_0)\in \mathbb{R}^4 such that in each neighborhood of \mathbf{u}_0 there are uncountable many points of A. The rest is the same.

Three graph problems on IMO 2022?! Problem 6 >>

IMO 2021 Shortlist, C8.

In the previous blog post, we considered a general idea that is applicable in combinatorics. In particular, a hard problem from 2007 Bulgarian Olympiad was solved. Here, the same method will be applied to a problem from IMO 2021 shortlist.
Consider permutations of 1,2,\dots,m. One can think of them as functions [1..m]\to [1..m] and define a distance between two permutations as the maximum pointwise distance between their values. The question is: How many permutation we can choose, such that the distance between any two of them is at least 2 ?

Problem (SL 2021, C8). Determine the largest N for which exists a table T of integers of N rows and 100 columns that has the following property:

  • (i) Every row contains the numbers 1,2,\dots,100 in some order.
  • (ii) For any two distinct rows r and s there is a column c such that |T(r,c)-T(s,c)|\ge 2.
    (here T(r,c) means the number at the intersection of the row r and the column c )


Look on each row n as a function f_n: [1..100]\to [1..100] which has an additional property of being a permutation. Let F be the family of all possible functions (permutations) like that. For any two f,g\in F we can define a non negative number d(f,g) , call it the distance between them as

\displaystyle d(f,g):= \max_{k\in [1..100]} |f(k)-g(k)|

This distance satisfies all the properties of being a “distance” as for instance the triangle inequality, etc. Denote by B(f) the unit ball centered at f, i.e.

\displaystyle B(f):= \{g : g\in F, d(f,g)\le 1\}

Upper bound. Let F_1\subset F be a collection of the given functions that satisfies (ii). It means that d(f,g)\ge 2, \forall f,g\in F_1, f\ne g. So, the unit ball centered at any f\in F_1 does not contain any other point (function) in F_1.
Let’s now fix f\in F and see what kind of functions B(f) consists of. Take into account that F consists of only permutations. We are interested of the values of f, see the picture.

Any g\in B(f) can be obtained from f applying a composition with \varphi which is a permutation and satisfies.

\displaystyle  \varphi(k) \in\begin{cases}\{k-1,k,k+1\}, & k=2,3,\dots,99\\ \{1,2\}, & k=1\\ \{99,100\}, & k=100\end{cases} \qquad(1)

That is,

\displaystyle B(f)=\{\varphi\circ f : \varphi \text{ is a permutation complying with (1)}\}

It’s easy to see that all permutations \varphi like that can be obtained by choosing some consecutive disjoint pairs (i,i+1), 1\le i\le 99 and “swap” their values (\varphi(i):=i+1, \varphi(i+1):=i) and keeping fixed the non paired numbers.

Knowing what the structure of B(f) is, back to the main point. A greedy algorithm of obtaining F_1 is to start with a function f_1, and at j-th step j=2,3,\dots to take f_j that is outside B(f_1)\cup\dots\cup B(f_{j-1}). The main difficulty in this approach is that two balls can have common points and it’s difficult to put some bounds on the maximum number of functions f_1,f_2\dots we can construct with this procedure. Fortunately, each ball contains a subset, call it a “kernel”, and all those kernels are disjoint!
Let’s partition the numbers from 1 to 100 into 50 pairs \{2i-1,2i\}, i=1,2,\dots,50 and \Phi'\subset \Phi be the subset of those permutations which act on each pair 2i-1,2i,i=1,2,\dots,50 either as swapping 2i-1,2i or leaving it intact. Thus,

\displaystyle B'(f):= \{\varphi'\circ f : \varphi'\in \Phi'\}

is a subset of B(f) and the following claim holds

Claim. For any g\notin B(f) we have B'(f)\cap B'(g)=\emptyset.

Proof. suppose h\in B'(f)\cap B'(g). Then h=\varphi_1\circ f=\varphi_2\circ g which yields

\displaystyle g=\varphi_2^{-1}h=\varphi_2^{-1}\circ\varphi_1\circ f

and since \varphi_2^{-1}\circ\varphi_1\in \Phi' it follows \displaystyle g\in B'(f)\subset B(f), contradiction. \blacksquare

Now, suppose f_1,f_2,\dots,f_N are N functions/permutations [1..100]\to[1..100] with d(f_i,f_j)\ge 21, \forall 1\le i<j\le 100. It means B'(i), i=1,2,\dots,N are disjoint and since each of them consists of 2^{50} elements we get

N\cdot 2^{50}\le 100!

hence \displaystyle \boxed{N\le \frac{100!}{2^{50}}}.

A construction. It remains to show we construct a set T of N:=\frac{100!}{2^{50}} permutations like that. First, we consider the family of all partitions of the numbers 1,2,\dots,100 into 50 sets of two elements. The number of these partitions is 99\cdot 97\cdots 3\cdot 1. Indeed, the number 1 can be paired with 99 other numbers, then the least number among the remaining ones can be paired with 97 of the others, and so on. Further, we want to take into account the order of those 50 pairs by permuting them, that is, we want to put these two element sets in a row. In this way, we obtain 99!! \cdot 50!=\frac{100!}{2^{50}} of those objects. It remains to order each set \{i,j\} like (i,j) or (j,i) and each object will become a permutation i_1j_1 i_2j_2\dots i_{50}j_{50}.
The ordering of the pairs will be constructed inductively on the size 2m of the permutations. In our case 2m=100. For m=2 it can be made straightforward and checked that the distance between any two of the obtained permutations is at least 2.
Suppose now for some m\in \mathbb{N} we can order the pairs in any partition of 1,2,\dots,2m into sets of two elements, so that the distance between any two of the obtained permutations, obtained by permuting those pairings, is at least 2. Moreover, we assume all the two element sets \{1,\ell\}, \ell\in\{2,3,\dots,2m\} are ordered as (1,\ell).
Further, suppose the set \{1,2,\dots,2m+2\} is paired somehow and 1 is paired with \ell, 2\le \ell\le 2m+2. We order \{1,\ell\} as (1,\ell), then map the remaining 2m numbers to 1,2,\dots,2m conserving the order. At this point, we order the the remaining pairs the way it can be done for a pairing of 1,2,\dots,2m. Finally, we reverse their order.
Let’s see that for any two f,g of the permutations, we just constructed, it holds d(f,g)\ge 2. Assume (1,r) and (1,s) are pairs in f and g respectively. We first consider the case these two pairs are not at the same positions in f and g respectively. In this case (1,r) should be at the same position in f as the pair \{2,t\} in g, otherwise d(f,g)\ge 2 and we are done. But the pair \{2,t\} in g is ordered as (t,2) (we reversed the order in our inductive construction). Thus, in this case we certainly have d(f,g)\ge 2.
So, assume (1,r) in f is at the same position as (1,s) in g. If |r-s|\ge 2 we are done, and in case r=s the inductive hypothesis yields again d(f,g)\ge 2. Let now s=r+1. We remove the two pairs (1,r) and (1,s) from f and g respectively, thus obtaining new functions – f',g' that map [1..2m] to some values. The values of f' are 2,3,\dots,r-1, r+1,\dots,2m+2 and those of g' are 2,3,\dots,r-1,r, r+2,\dots,2m+2. Let f'' and g'' be the corresponding to f',g' functions which values are 2,3,\dots,2m+1 so that f' is obtained from f'' by lifting the values r,r+1,\dots,2m+1 one unit up and g' is obtained from g'' by lifting the values r+1,r+2,\dots,2m+1 by one unit up. We know, by the induction hypothesis, that d(f'',g'')\ge 2 and let j be the position where |f''(j)-g''(j)|\ge 2. Since r and r+1 are consecutive integers, either both values f''(j), g''(j) will be raised one unit up, or the greater value among f''(j), g''(j) will be raised up. In both cases |f'(j)-g'(j)|\ge 2, which means |f(j)-g(j)|\ge 2. This completes the induction step.