Graph Partitions Under Degree Constraints.

A nice graph theory problem came to my attention. It was given at the Ukraine NMO 2020. Its solution is extremely short, but the problem is difficult, in my opinion, especially if one has not seen the idea. This problem is in fact a result of László Lovász, published in 1966 (see [1]). Though the idea behind it is instructive and deserves attention, I hope it's not becoming a trend for famous theorems in graph theory to be turned into Olympiad problems. Another such example is problem 5 of the Bulgarian NMO 2020 [2].
Here is an outline of this blog post. I first give the mentioned problem with a short solution. Then Lovász's result [1], which is more general, is discussed. Finally, a result of Borodin and Kostochka [3] is presented, which is a further generalization. One may think there is something involved in these generalizations that makes them more difficult to grasp. Nothing like that! They are all proved the same way, by applying the same idea.

Problem (Ukraine NMO, 2020). Positive integers a, b, n are given such that a + b = n-1. In some school, each student has at most n friends from that school. Prove that all the students can be split into two groups A, B in such a way that each student in A has at most a friends among the students in A, and each student in B has at most b friends among the students in B.

I prefer the straightforward graph theory interpretation.

Problem 1. Let G be a given graph with maximum degree \Delta and d_1, d_2 be positive integers with d_1+d_2=\Delta -1. Prove that the set of vertices V can be partitioned into two subsets V_1,V_2 in such a way that the maximum degrees of the induced graphs on V_1 and V_2 are correspondingly at most d_1 and d_2.

Proof. For any partition of V into two sets V_1,V_2 denote by G_1, G_2 the graphs induced by V_1,V_2 and by |E(G_1)|, |E(G_2)| the number of edges in G_1, resp. G_2. For any such partition, consider the function

\displaystyle w(G_1, G_2):=\frac{|E(G_1)|}{d_1}+\frac{|E(G_2)|}{d_2}.

Let G_1(V_1), G_2(V_2) be a partition that minimizes w(\cdot,\cdot). We claim \Delta(G_1)\le d_1 and \Delta(G_2)\le d_2, where \Delta(G) denotes the maximum degree of G. Indeed, suppose there is a vertex (WLOG) v_1\in V_1 with d_{G_1}(v_1)>d_1. Note that v_1 is then connected to at most \Delta-d_{G_1}(v_1)\le d_2 vertices in V_2. Thus, \displaystyle \frac{d_{G_1}(v_1)}{d_1}>1\ge\frac{d_{G_2}(v_1)}{d_2}. Hence, if we move the vertex v_1 from V_1 to V_2, obtaining the new partition V'_1:=V_1\setminus \{v_1\}, V'_2:=V_2\cup \{v_1\}, then w(G'_1, G'_2) is strictly less than w(G_1, G_2), which contradicts the minimality of w(G_1, G_2). \blacksquare
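The minimization argument is effectively a local search, and it is easy to run: repeatedly move any vertex that violates the bound on its side; each move strictly decreases w, so the process terminates. Here is a minimal Python sketch (the adjacency-dict representation and the function name are my own choices, for illustration only):

```python
def lovasz_split(adj, d1, d2):
    """Partition the vertices of a graph with max degree <= d1 + d2 + 1
    into V1, V2 whose induced subgraphs have max degree <= d1, <= d2.
    Moving a violating vertex strictly decreases
    w = |E(G1)|/d1 + |E(G2)|/d2, so the loop terminates."""
    side = {v: 1 for v in adj}              # start with everything in V1
    done = False
    while not done:
        done = True
        for v in adj:
            deg_same = sum(side[u] == side[v] for u in adj[v])
            if deg_same > (d1 if side[v] == 1 else d2):
                side[v] = 3 - side[v]       # move v to the other side
                done = False
    return ({v for v in adj if side[v] == 1},
            {v for v in adj if side[v] == 2})

# K_6 with d1 = d2 = 2 (max degree 5 = d1 + d2 + 1):
# the split must produce two triangles
adj = {v: [u for u in range(6) if u != v] for v in range(6)}
V1, V2 = lovasz_split(adj, 2, 2)
```

No clever starting partition is needed; the potential function does all the work. The same local search also handles Theorem 2 below, with k sides and the weight \sum_i |E(G_i)|/d_i.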

I posted the problem in the AoPS forum [6]; one can see a solution presented there. Note that the claim also holds when one of d_1,d_2 is zero. Assume d_1=0, hence d_2=\Delta-1. Take V_1 to be a maximal independent set in G. Then any vertex v\in V_2:=V\setminus V_1 is connected to at least one vertex in V_1, otherwise V_1\cup \{v\} would be a larger independent set. So d_{G_2}(v)\le \Delta-1 and V_1,V_2 is a desired partition.

Theorem 2 (László Lovász 1966, [1]). Let G be a graph with maximum degree \Delta, and let d_1,d_2,\dots,d_k be nonnegative integers with d_1+d_2+\dots+d_k\ge \Delta -k +1. Then the set V of the vertices of G can be partitioned into sets V_1,V_2,\dots,V_k so that the subgraph G_i induced by V_i has maximum degree at most d_i\,,\, i=1,2,\dots,k.

Note that the case k=2 is exactly our Problem 1. Perhaps the easiest way to prove Theorem 2 is to induct on k, applying the same argument at each step.
Another way is to consider the function (in case all d_i are positive)

\displaystyle w(G_1,G_2,\dots,G_k):= \sum_{i=1}^k \frac{|E(G_i)|}{d_i}

and take a partition that minimizes w. This claim appears as exercise 5.1.50, p. 203, in the wonderful book of D. West [5].

Theorem 3 (Borodin and Kostochka [3]). Let G be a graph and let f_i : V(G)\to \mathbb{Z}_{\ge 1}, i=1,2,\dots,k, k\ge 2, be functions such that

d_G(v)<f_1(v)+f_2(v)+\dots+f_k(v), \forall v\in V(G).

Then there is a partition V_1,V_2,\dots,V_k of V(G) inducing subgraphs G_i,i=1,2,\dots,k such that d_{G_i}(v)<f_i(v) for any i=1,2,\dots,k and any v\in V_i.

Proof (Lemma 2.1 in [4]). The case k>2 follows easily from the case k=2 by induction. For k=2 we consider the function

\displaystyle w(G_1,G_2):= |E(G_1)|+|E(G_2)|+\sum_{v\in V_1}f_2(v)+\sum_{v\in V_2}f_1(v).

Let G_1,G_2 be a partition that minimizes w(G_1,G_2). The same argument as used in Problem 1 shows this partition has the desired property. \blacksquare

Lovász's result (Theorem 2) is the special case of Theorem 3 obtained by taking f_i(v)=d_i+1, i=1,2,\dots,k.
One may see similar results for digraphs in a paper of Noga Alon, see [7].

[1] L. Lovasz, On decomposition of graphs, Stud. Sci. Math. Hung. 1 (1966), 237–238.
[2] Problem 5 of Bulgarian NMO 2020.
[3] O.V. Borodin and A.V. Kostochka, On an upper bound of a graph's chromatic number, depending on the graph's degree and density, J. Combin. Theory Ser. B 23 (1977), 247–250.
[5] Douglas B. West, Introduction to Graph Theory, Second edition, Pearson Education, Inc., 2002. (p. 203, exercise 5.1.50)
[7] Noga Alon, Splitting Digraphs.

Hamel Bases, Part 3. Application on an Olympiad Problem.

In two previous posts (part 1, part 2) I commented on Olympiad problems that can be attacked using the notion of a Hamel basis. Moreover, this appears to be the easiest way to solve them; for example, I don't know another solution to the problem presented in part 2 (given at a Romanian TST in 1994). Here, another such problem is considered. It was given at the 1997 Iran third round NMO. I saw it in [1] along with the solution that follows.

Problem (Iran III Round MO 1997). Let S = {x_0, x_1,\dots , x_n} be a finite set of numbers in the interval [0, 1] with x_0 = 0 and x_1 = 1. We consider the pairwise distances between the numbers in S. If every distance that appears, except the distance 1, occurs at least twice, prove that all the x_i are rational.

Suppose, for the sake of contradiction, that not all x_i are rational. Let B be a Hamel basis of \mathbb{R} over \mathbb{Q}. Thus, any real number x is uniquely represented as

\displaystyle x=\sum_{b\in B} \lambda_b b

where \lambda_b\in\mathbb{Q} and only finitely many summands are nonzero for each x. Hence

\displaystyle x_j=\sum_{i\in I} \lambda_{j,i} b_i\,,\, j=0,1,\dots,n

where I is a finite set of indices and b_i\in B,\forall i\in I. By our assumption, there exists k\in I for which b_k\notin \mathbb{Q} and at least one coefficient \lambda_{j,k}\neq 0. Denote \Lambda_k:= \{\lambda_{j,k}\mid j=0,1,\dots,n\}. Note that 0\in \Lambda_k, since x_0=0 and x_1=1 imply \lambda_{0,k}=\lambda_{1,k}=0. Let m:=\min\Lambda_k, M:=\max\Lambda_k. It is impossible that m=M, since then we would have m=M=0 and \Lambda_k=\{0\}, contradicting the choice of k. Hence m<M. Let X_M:=\{x_j \mid \lambda_{j,k}=M, 0\le j\le n \} and X_m:=\{x_j \mid \lambda_{j,k}=m, 0\le j\le n \}. What follows is the final observation.
Take any x\in X_M, y\in X_m. The coefficient of b_k in x-y equals M-m\neq 0, while the coefficient of b_k in 1 is zero, so x-y\neq 1. Hence, by the initial condition, there exist x_r,x_s, 0\le r,s\le n, with x-y=x_r-x_s and \{x,y\}\neq \{x_r,x_s\}. Comparing the coefficients of b_k gives M-m=\lambda_{r,k}-\lambda_{s,k}, which forces \lambda_{r,k}=M, \lambda_{s,k}=m, that is, x_r\in X_M, x_s\in X_m.
Now it remains to take x:=\max X_M, y:=\min X_m to get a contradiction: since x_r\le x and x_s\ge y while x_r-x_s=x-y, we are forced to have x_r=x, x_s=y, i.e. \{x,y\}=\{x_r,x_s\}.
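As a sanity check of the hypothesis itself, exact rational arithmetic makes it easy to test the distance condition on small sets; for instance, {0, 1/3, 1/2, 2/3, 1} satisfies it, while {0, 1/4, 1} does not. The helper below is my own, just for illustration:

```python
from fractions import Fraction
from collections import Counter

def condition_holds(S):
    """Every pairwise distance in S, except the distance 1, occurs >= 2 times."""
    dist = Counter(abs(x - y) for i, x in enumerate(S) for y in S[i + 1:])
    return all(c >= 2 for d, c in dist.items() if d != 1)

good = [Fraction(k, 6) for k in (0, 2, 3, 4, 6)]   # 0, 1/3, 1/2, 2/3, 1
bad = [Fraction(0), Fraction(1, 4), Fraction(1)]   # 1/4 and 3/4 occur once
```

Here the distances 1/3, 1/2, 2/3 and 1/6 in the first set each occur at least twice, exactly as the problem requires.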

<< To part 1
<< To part 2


Traps in the Probabilistic Method. Part 3.

We continue the theme of the two previous blog posts (part 1 and part 2) about some of the pitfalls the probabilistic method can set for us. The problem reviewed here is problem 9, given at the USA TSTST 2018. At the end, a non-probabilistic solution I came up with is described.

Problem 9. (USA TSTST 2018) Show that there is an absolute constant c < 1 with the following property: whenever \mathcal P is a polygon with area 1 in the plane, one can translate it by a distance of \frac{1}{100} in some direction to obtain a polygon \mathcal Q, for which the intersection of the interiors of \mathcal P and \mathcal Q has total area at most c.
Author: Linus Hamilton.

Here I cite the beginning of a solution provided in a popular forum. It is not wrong, or even inaccurate. However, one argument (in red in the original; it is the summation at the end of the lemma's proof) is not by itself sufficient to reach such a conclusion.

Suppose \mathcal{P} is a polygon of area 1, and \varepsilon > 0 is a constant, such that for any translate \mathcal{Q} = \mathcal{P} + v, where v has length exactly \frac{1}{100}, the intersection of \mathcal{P} and \mathcal{Q} has area at least 1 - \varepsilon. The problem asks us to prove a lower bound on \varepsilon [not depending on \mathcal{P}].

Lemma: Fix a sequence of n vectors v_1, v_2, \dots, v_n, each of length \frac{1}{100}. A grasshopper starts at a random point x of \mathcal{P}, and makes n jumps to x + v_1 + \dots + v_n. Then it remains in \mathcal{P} with probability at least 1 - n\varepsilon.

Proof. In order for the grasshopper to leave \mathcal{P} at step i, the grasshopper’s position before step i must be inside the difference set \mathcal{P} \setminus (\mathcal{P} - v_i). Since this difference set has area at most \varepsilon, the probability the grasshopper leaves \mathcal{P} at step i is at most \varepsilon. Summing over the n steps, the probability that the grasshopper ever manages to leave \mathcal{P} is at most n \varepsilon.

Before commenting on it, let me make an overture. It is a very loose example, just to convey the idea. Suppose we have events E_1, E_2,\dots,E_n, each with the same probability of occurring. Now suppose we somehow make each event E_i, 1<i\le n, depend on the previous one, E_{i-1}. Then there is no longer any guarantee that the probability of E_i is the same as that of E_1.
Back to the presented solution. Let T_{v_i}, i=1,2,\dots,n, denote the translation by the vector v_i. Let E_i be the event that a random point x\in \mathcal{P} leaves \mathcal{P} after applying T_{v_i}. Then, indeed, each E_i has a "small" probability \varepsilon of happening. But in our case these translations are applied one after another, i.e. the maps actually applied are T_{v_1}, T_{v_2}\circ T_{v_1},\dots, T_{v_n}\circ\dots\circ T_{v_1}. Thus, the result of each translation depends on the previous ones. Luckily, the same bound still holds, but the only thing preventing it from failing is that translations are "good" transformations: they preserve area. That is, apart from the argument in red, another geometric property of translations is crucial. Let us see how it fails for a general transformation T.

Consider a transformation T: \mathbb{R}^2\to \mathbb{R}^2. Suppose that only a small set S\subset P leaves P under this transformation, with |S|/|P|=\varepsilon. It indeed means that the probability that a point x\in P leaves P under T is \varepsilon.
But it does not mean that the probability of leaving at the i-th step is also \varepsilon. Indeed, suppose there is a large set S'\subset P with T(S')=S. Then T(T(S')) is outside P at the second step, so the probability that T\circ T(x) leaves \mathcal{P} can be as close to 1 as we want (provided T is not as well-behaved as a translation).
Thus, the whole argument strongly depends on the transformation T (at the very least, it should preserve area), and the probabilistic argument uses that fact, though implicitly. What happened, in fact, is that we first proved this seemingly obvious claim purely geometrically in our minds, and afterwards it was dressed up in probabilistic language.

A non probabilistic approach.

Let D be the unit disk. We want to show there is a translation vector v\in D such that m\big((P+v)\cap P\big)\leq 1-\varepsilon for some absolute constant \varepsilon>0. Suppose, on the contrary, that this is not true.
Consider the set P\times D := \{(x,v)\mid x\in P, v\in D\} and let P'\subset P\times D be defined by: (x,v)\in P' iff x+v\in P. In fact P' is 4-dimensional, but this is not important. Now, for any fixed x_0\in P, consider the "vertical" slice P'_{x_0}:=\{(x_0, v)\mid (x_0,v)\in P'\}. It is contained in a translate of P, so m(P'_{x_0})\leq 1, where m(\cdot) denotes the measure (in this case, area) of the slice. It means

\displaystyle m(P')\leq \sup_{x_0\in P} m(P'_{x_0})\cdot m(P)\leq 1.

On the other hand, consider the "horizontal" slice P'_{v_0}:= \{(x,v_0)\mid (x,v_0)\in P'\}. By our assumption, m(P'_{v_0})\geq 1-\varepsilon. It yields

\displaystyle m(P')\geq \inf_{v_0\in D} m(P'_{v_0})\cdot m(D)\geq (1-\varepsilon)\pi.

Combining the above inequalities, we get

(1-\varepsilon)\pi\leq 1.

But for small \varepsilon>0 this fails, thus a contradiction. It is enough to take any \displaystyle \varepsilon<1-\frac{1}{\pi}.
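To see the two Fubini slices concretely, take P to be the unit square (an assumption for illustration only; the argument concerns an arbitrary polygon of area 1). Then the horizontal slice has area m(P'_{v})=(1-|v_x|)(1-|v_y|) for v\in D, and m(P')=\int_D (1-|v_x|)(1-|v_y|)\,dv can be computed on a grid; it equals \pi-13/6\approx 0.975, consistent with the vertical-slice bound m(P')\le 1.

```python
import math

# P = [0,1]^2, so the horizontal slice has area (1-|vx|)*(1-|vy|) for v in D.
N = 800                      # midpoint grid over the bounding box of D
h = 2.0 / N
m_P_prime = 0.0
for i in range(N):
    vx = -1.0 + (i + 0.5) * h
    for j in range(N):
        vy = -1.0 + (j + 0.5) * h
        if vx * vx + vy * vy <= 1.0:          # keep v inside the unit disk
            m_P_prime += (1 - abs(vx)) * (1 - abs(vy)) * h * h

closed_form = math.pi - 13.0 / 6.0            # exact value of the integral
```

Of course, for the square \inf_{v\in D} m(P'_{v})=0 (take v=(1,0)), so the lower bound (1-\varepsilon)\pi never materializes: a square is easy to translate off itself, exactly as the problem predicts.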

<< To the first part
<< To the second part

Traps in the Probabilistic Method. Part 2.

Probability is a slippery matter. Sometimes human intuition and experience lead to wrong decisions. That is why, when dealing with probability, one should always keep in mind what the sample space is, what the family of events is, how the random variables are defined, and so on. In a previous post we considered a graph theory problem approached via the probabilistic method, and we saw how forgetting for a moment what the sample space of possible outcomes is brought us to a wrong conclusion. Another graph problem is reviewed here. It is taken from a popular handout on the probabilistic method (which I would recommend). The linearity of expectation of certain random variables is the key point. But one should remember that a random variable is not just a vague quantity that we take to vary randomly. It has a precise definition, which must be kept in mind to avoid inaccuracies like the one below.

Problem. Prove that any subgraph of K_{n,n} with at least n^2-n+1 edges has a perfect matching.

It's an easy problem; Hall's marriage condition kills it instantly. It's included just as an example of linearity of expectation. Let's see the solution presented in the paper.

Solution. So let’s be really careless and just randomly pair off one set of points with the other, regardless of whether there is actually an edge present [green pairs on the picture]. We call the score of such a pairing the number of pairs which are actually connected by an edge. We wish to show that some pairing has score n, as this will be the desired perfect matching.
So what’s the expected value of a random pairing? Number the pairs 1, 2,\dots, n and define

\displaystyle X_i:=\begin{cases}1 &\text{ if the }i \text{th pair is connected by an edge,}\\0 &\text{ otherwise.}\end{cases}

Then the score of the configuration is X = X_1 + X_2 +\dots + X_n. Given any red point and any blue point, the probability they are connected by an edge is at least \displaystyle \frac{n^2-n+1}{n^2} . This means that \displaystyle E[X_i] = \frac{n^2-n+1}{n^2} , so

\displaystyle E[X] = E[X_1] + \dots+ E[X_n]=
\displaystyle = n\cdot E[X_1] =\frac{n^2-n+1}{n}=n-1+\frac{1}{n}.

Since X takes only integer values, there must be some configuration which achieves X = n. Thus, we’re done.

So, what is wrong (or rather, imprecise) in this argument? It is the definition of the random variables X_1,X_2,\dots,X_n.
Ok, one can randomly pair off the two sets of points, that is fine. The sample space of outcomes consists of all possible pairings of blue vertices with red vertices, each pairing taken with equal probability. That is, each outcome is a set of n disjoint pairs. I underlined the word "set" on purpose, because a set carries no order. So one cannot just say "Number the pairs 1, 2,\dots, n". The way of enumeration does matter! It must be explicitly specified how the set is enumerated. For example, if you enumerate the pairs by the blue vertex each of them is incident with, then you cannot guarantee that the probability P(X_1=1) is at least \displaystyle \frac{n^2-n+1}{n^2}. Indeed, suppose the first blue vertex is connected with only one red vertex (and all the other edges are present). In this case \displaystyle P(X_1=1)=\frac{1}{n}.
Thus, the definition of the random variables is not correct, because different enumerations lead to completely different random variables. Moreover, under the most natural enumeration, by the blue vertex of the corresponding pair, E[X_1] can be almost anything.

Fortunately, this solution can be fixed. One way is to take the random ensemble of pairs and then enumerate the pairs randomly 1,2,\dots,n (an additional randomness). Perhaps a clearer way is to enumerate them by the blue vertices they are incident with; then P(X_i=1) takes different values, depending on the degree of the corresponding blue vertex. But this does not matter much: if some P(X_i=1) is small (as above), others are large, and a quick calculation shows the sum E[X_1] + \dots+ E[X_n] is the same as above: \displaystyle n-1+\frac{1}{n}.
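Both fixes give E[X]=n-1+\frac{1}{n}, and this is easy to confirm by brute force for a small n. A sketch with n=4 and an arbitrarily chosen subgraph of K_{4,4} with n^2-n+1=13 edges (the three deleted edges are my choice): averaging the score over all n! pairings gives exactly n-1+1/n, and some pairing indeed reaches n.

```python
from itertools import permutations

n = 4
# a subgraph of K_{n,n} with n^2 - n + 1 = 13 edges (3 edges deleted)
edges = {(i, j) for i in range(n) for j in range(n)} - {(0, 0), (0, 1), (0, 2)}

# score of a pairing p: how many of its n pairs are actual edges
scores = [sum((i, p[i]) in edges for i in range(n))
          for p in permutations(range(n))]
average = sum(scores) / len(scores)   # E[X] over all n! pairings
```

Here average comes out to 13/4 = n-1+1/n, and max(scores) = 4: a perfect matching exists.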
Concluding the blog post: I am far from thinking that the author was not aware of everything I just said. No, he had merely been a bit careless at that moment. That is it.

<< To the first part
To the third part >>

IMO 2020, Problem 4. Chains and Antichains.

We interpret this problem in terms of a certain poset under two different partial orders, as an application of Dilworth's theorem. Perhaps it is a bit too much, since the order of the stations given by the linkages of each company (as defined in the statement) is a simple one: there are no forks in it. But anyway, it is interesting to consider how the statement could be generalized.

Problem 4 (IMO 2020). There is an integer n > 1. There are n^2 stations on a slope of a mountain, all at different altitudes. Each of two cable car companies, A and B, operates k cable cars; each cable car provides a transfer from one of the stations to a higher one (with no intermediate stops). The k cable cars of A have k different starting points and k different finishing points, and a cable car which starts higher also finishes higher. The same conditions hold for B. We say that two stations are linked by a company if one can start from the lower station and reach the higher one by using one or more cars of that company (no other movements between stations are allowed).
Determine the smallest positive integer k for which one can guarantee that there are two stations that are linked by both companies.

Solution. The answer is k= n^2-n+1. Let us first prove that for k\ge n^2-n+1 there always exist two stations linked by both companies. Assume, for the sake of contradiction, that this is not true. Denote by S the set of all stations. We introduce two binary relations <_A and <_B. For any x,y\in S we write x<_A y iff x and y are linked by company A and x is the lower one; the relation <_B is defined similarly. Clearly, (S, <_A) and (S, <_B) are partially ordered sets. According to Dilworth's theorem,

\mbox{max antichain size in }(S,<_A)=\mbox{number of chains in a smallest chain decomposition of }(S,<_A)

The same holds for (S,<_B). Note that the smallest chain decomposition of (S,<_A) consists of at most n^2-k chains. Indeed, the k cable cars of A have k distinct starting points and k distinct finishing points, and every car goes upwards, so the cars form a disjoint union of upward paths. These paths, together with the stations not covered by them, form a chain decomposition of (S,<_A); starting from the n^2 one-station chains, each of the k cars merges two chains into one, so this decomposition has exactly n^2-k chains. Thus

\mbox{"max antichain size in }(S,<_A)\mbox{"} \le n^2-k<n\qquad(1)

The same holds for (S,<_B). Note now that any chain in (S,<_A) is an antichain in (S,<_B), and vice versa. Indeed, if x_1<_A x_2<_A\dots <_A x_m is a chain, then no two x_i,x_j can be comparable with respect to <_B, otherwise these two stations would be linked by both companies. Hence

\mbox{"max chain size in }(S,<_A) \mbox{"}\le \mbox{"max antichain size in }(S,<_B)\mbox{"}<n\qquad(2)

But the chains of a smallest chain decomposition of (S,<_A) cover S; by Dilworth's theorem there are "max antichain size in (S,<_A)" of them, and each contains at most "max chain size in (S,<_A)" stations. Hence

\mbox{(max antichain size in }(S,<_A)\mbox{)}\cdot \mbox{(max chain size in }(S,<_A)\mbox{)}\ge |S|=n^2

which contradicts (1) and (2).

A counterexample for k=n^2-n. The cars of company A are \ell\to \ell+n, \ell=1,2,\dots,n^2-n. The cars of B are defined as \ell\to\ell+1,\ell\not\equiv 0\pmod{n},\ell=1,2,\dots,n^2-1.
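The counterexample can be verified mechanically for a small n (n = 4 below, my choice): compute all pairs linked by each company and check that no pair is linked by both.

```python
n = 4
stations = range(1, n * n + 1)

def linked_pairs(cars):
    """All pairs (x, y), x < y, joined by one or more cars of one company."""
    reach = {s: set() for s in stations}
    # each station starts at most one car and cars go upwards,
    # so processing cars by decreasing start point is safe
    for lo, hi in sorted(cars, reverse=True):
        reach[lo] = {hi} | reach[hi]
    return {(x, y) for x in stations for y in reach[x]}

cars_A = [(l, l + n) for l in range(1, n * n - n + 1)]
cars_B = [(l, l + 1) for l in range(1, n * n) if l % n != 0]
common = linked_pairs(cars_A) & linked_pairs(cars_B)   # should be empty
```

The same check passes for any n: company A links exactly the stations with equal residues mod n, while company B links exactly the stations inside one block of n consecutive stations, and the two families intersect only in single stations.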

A kind of generalization.

The same argument as above allows us to prove the following claim.
Let S, |S|=m, be a set endowed with two partial orders <_1 and <_2 for which the following conditions hold:
1) the smallest chain decompositions of (S,<_1) and of (S,<_2) each consist of at most r chains, where 1\le r\le m;
2) any chain in (S,<_1) is an antichain in (S,<_2) and vice versa.
Then r^2\ge m.

Traps in the Probabilistic Method. Part 1.

I plan, in a series of posts, to consider Olympiad problems that can be solved using the probabilistic method. The main point is how easily one can go wrong by not handling the basic concepts carefully. This can happen even to experienced and talented people, such as (former) IMO contestants; in fact, all the mistakes I am going to present were made by such people. I intentionally do not give links. My goal is not to point out somebody's mistakes, but to emphasize that we have to be careful when dealing with a matter that often allows wrong intuition to prevail. Intuition is a precious gift, but it should be kept under control.
Let us start with the first example, a problem given at the Middle European Mathematical Olympiad (MEMO), team competition, in 2011.

Problem 1 (MEMO 2011, team competition, T-4). Let n \geq 3 be an integer. At a MEMO-like competition, there are 3n participants, there are n languages spoken, and each participant speaks exactly three different languages. Prove that at least \left\lceil\frac{2n}{9}\right\rceil of the spoken languages can be chosen in such a way that no participant speaks more than two of the chosen languages.

Solution (a flawed one). Take random subset S of languages, where |S|=m=\lfloor \frac{n}{3}\rfloor (every subset with equal probability). Expected value participants with degree 3 will be equal \displaystyle \frac{{m \choose{3}}}{{n \choose{3}}} \cdot 3n<3n(\frac{m}{n})^3 \leq \frac{n}{9}, so there exist subset P, where |P|=\lfloor \frac{n}{3}\rfloor and there are at most \lfloor\frac{n}{9}\rfloor guys with degree 3. Now we delete from P one language for every guy and we get subset of at least \lfloor \frac{n}{3}\rfloor -\lfloor\frac{n}{9}\rfloor languages. We’re not done only when n\equiv 1, 2, 5\pmod 9, but while proving this cases it’s sufficient to more precisely estimate expected value.

At first glance it's ok. But look at the calculation of the expected number of participants with degree 3 [with respect to the languages in S]. It equals the probability that one fixed student has degree 3, multiplied by the number of participants. So what is the probability that a fixed participant p has degree 3? In the above solution it is \displaystyle \frac{\binom{m}{3}}{\binom{n}{3}}. But why? It is as if the set S were fixed and the three languages \ell_1,\ell_2,\ell_3 that p speaks varied with equal probability! The setting described at the beginning of the solution is different: the languages \ell_1,\ell_2,\ell_3 are fixed and the set S with m elements is randomly chosen. The number of outcomes with \ell_i\in S, i=1,2,3, is exactly \displaystyle \binom{n-3}{m-3}, out of \displaystyle \binom{n}{m} outcomes in total, so the probability in question is \displaystyle \frac{\binom{n-3}{m-3}}{\binom{n}{m}}. Luckily for the presented solution, the two values coincide: \displaystyle \frac{\binom{n-3}{m-3}}{\binom{n}{m}}=\frac{m(m-1)(m-2)}{n(n-1)(n-2)}=\frac{\binom{m}{3}}{\binom{n}{3}}, a symmetry of the hypergeometric distribution. So the computed expectation is correct after all, and the estimate goes through; the trap is that the justification silently swaps two different sample spaces and survives only thanks to this symmetry.
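The counting over random sets S can be confirmed by direct enumeration for a small pair (n, m); the values below are my choice:

```python
from itertools import combinations
from math import comb

n, m = 9, 3
fixed = {0, 1, 2}        # the three languages of one fixed participant

# enumerate all m-element sets S of languages and count those
# containing all three fixed languages
hits = sum(1 for S in combinations(range(n), m) if fixed <= set(S))
p_fixed_participant = hits / comb(n, m)
```

For these values, hits equals \binom{n-3}{m-3}, matching the counting above.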

Same problem, but in other words.

I will present two other approaches, one of them non-probabilistic, but let us first translate the problem into a more meaningful one. The correspondence "participants – spoken languages" can be viewed as a bipartite graph G(P, L) with |P|=3n, |L|=n, in which each vertex of P has degree 3. We have to show there exists L_1\subset L with at least \left\lceil\frac{2n}{9}\right\rceil vertices such that each v\in P has a neighbour outside L_1. It means the set L':= L\setminus L_1 is a dominating set, i.e. it covers P: N(L')=P, where N(X) denotes the set of all neighbours of vertices in X.
Thus, the equivalent but more meaningful version is to prove the existence of a set L'\subset L with at most \left\lfloor \frac{7n}{9}\right\rfloor vertices such that N(L')=P.

A Non Probabilistic Approach.

A direct greedy algorithm for finding such a set L' is to take a vertex \ell_1\in L of maximum degree, remove it together with all its neighbours N(\ell_1) in P, and repeat the same procedure until no elements are left in P. I think it leads to more complicated calculations than the probabilistic approach, but it can be completed this way. This problem appeared in a popular forum some 10 years ago, with several solutions, only one of them non-probabilistic. Actually, it was mine, and it was an attempt to exploit this straightforward algorithm. Unfortunately, it had a flaw, as I see now, but I believe it can be repaired, because it works in particular cases, e.g. n=10. However, I don't know if it's worth it, because there is a clearer and shorter way.

A Probabilistic Approach.

Let d be the minimum number of vertices in L that can cover P. Enumerate the vertices of L and P as \ell_1,\ell_2,\dots,\ell_n and p_1,p_2,\dots,p_{3n}, respectively. Choose each vertex of L independently at random with probability p, 0<p<1, where p will be determined later. Let X_i, i=1,2,\dots,n, be the random variable indicating whether \ell_i is chosen, i.e. X_i=1 if \ell_i is selected and X_i=0 otherwise. Let Y_i, i=1,2,\dots,3n, be the random variable with Y_i=1 if none of the neighbours of p_i is chosen and Y_i=0 otherwise. Then

X_1+X_2+\dots+X_n +Y_1+Y_2+\dots+Y_{3n}\ge d\qquad(1)

Indeed, X_1+X_2+\dots +X_n is the number of chosen vertices in L, and Y_1+Y_2+\dots+Y_{3n} is the number of vertices in P not covered by the chosen ones. Taking additionally one neighbour for each uncovered vertex, we construct a cover of P, hence the LHS of (1) is at least d. Note that P(X_i=1)=p and P(Y_i=1)=(1-p)^3. Taking the expectation of the LHS of (1) and using its linearity, we get

np+3n(1-p)^3\ge d\qquad(2)

Thus, to get the best possible upper bound for d, we choose the p\in (0,1) that minimizes the LHS of (2). A bit of calculus shows this happens at p=2/3, which yields

\displaystyle d\le \frac{7n}{9}.
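The argument is constructive enough to run. In the sketch below (the instance size n=9, the seed and the number of trials are my choices), each language is marked with probability 2/3 and one spoken language is added for every uncovered participant; by (2) the expected size of the resulting cover is at most 7n/9=7, so a cover of that size shows up among a few trials.

```python
import random

random.seed(2021)
n = 9
languages = range(n)
# random instance: each of the 3n participants speaks exactly 3 languages
participants = [random.sample(languages, 3) for _ in range(3 * n)]

def one_trial(p=2 / 3):
    """One round of the argument: mark each language with probability p,
    then add one spoken language for every uncovered participant."""
    chosen = {l for l in languages if random.random() < p}
    for spoken in participants:
        if not chosen.intersection(spoken):
            chosen.add(spoken[0])
    return len(chosen)

best = min(one_trial() for _ in range(2000))   # expected size <= 7n/9 = 7
```

Since the expectation per trial is at most 7, some trial must reach a cover of size at most 7, and in practice one appears almost immediately.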

To the next part >>

An AMM Problem. Convergent Series.

Here is a problem I encountered recently. It looks as if some technical calculations are needed, but a surprisingly simple idea makes it solvable in one's head.

Problem. (AMM, Proposed by A. Torchinsky). Suppose that \displaystyle \sum_{n=1}^{\infty} a_n is a divergent series of positive terms, and let s_n=a_1+\dots+a_n for n=1,2,\dots . For which values of p does the series \displaystyle \sum_{n=1}^{\infty}\frac{a_n}{s_n^p} converge?

So, let us translate everything in terms of s_n. We are given an increasing sequence of positive numbers s_n, n=1,2,\dots, with \displaystyle \lim_{n\to\infty} s_n=+\infty. Set s_0:=0. The question is: for which p\in \mathbb{R} does the following series always converge?

\displaystyle \sum_{n=1}^{\infty} \frac{s_n-s_{n-1}}{s_n^p}.

For p\le 1 there is no chance of this happening: for example, one can take the sequence s_n=2^n, n=1,2,\dots, and see that the corresponding series diverges. We prove the series always converges when p>1. Moreover, we prove this even without the requirement \displaystyle \lim_{n\to\infty} s_n=+\infty.

Assume 2^k\le s_{m}<s_{m+1}<\dots <s_{\ell}<2^{k+1} for some k\in \mathbb{Z}_{\ge 0}. Then the following estimate holds

\displaystyle \frac{s_m-s_{m-1}}{s_m^p}+\frac{s_{m+1}-s_m}{s_{m+1}^p}+\dots+\frac{s_{\ell}-s_{\ell-1}}{s_{\ell}^p}\le

\displaystyle \le \frac{2^{k+1}}{2^{kp}} +\frac{s_{m+1}-s_m}{2^{kp}}+\dots+\frac{s_{\ell}-s_{\ell-1}}{2^{kp}}\le

\displaystyle \le \frac{2^{k+1}}{2^{kp}} + \frac{2^{k+1}-2^k}{2^{kp}}= 3\cdot\frac{1}{2^{(p-1)k}}\qquad (1)

Next, we estimate the part of the sum corresponding for the small values of s_n (if such ones exist) for which 0<s_n<1. Assume 0<s_1<s_2<\dots<s_{\ell}<1. Then

\displaystyle \frac{s_1}{s_1^p}+\frac{s_2-s_1}{s_2^p}+\dots+\frac{s_{\ell}-s_{\ell-1}}{s_{\ell}^p}\le \frac{1}{s_1^p}\sum_{i=1}^{\ell} (s_i-s_{i-1})\le \frac{1}{s_1^p}\qquad (2)

Summing the estimates (1) over k\ge 0 and adding (2), we obtain

\displaystyle \sum_{n=1}^{\infty} \frac{s_n-s_{n-1}}{s_n^p}\le \frac{1}{s_1^p} +3\sum_{k=0}^{\infty} \frac{1}{2^{(p-1)k}}.

Since the RHS is finite for p>1, the series on the LHS converges too.
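A quick numerical illustration with the threshold sequence s_n=2^n: for p=1 every term after the first equals 1/2, so the partial sums grow without bound, while for p=2 the partial sums approach 3/4.

```python
def partial_sum(p, N):
    """Partial sum of sum_{n=1}^{N} (s_n - s_{n-1}) / s_n^p with s_n = 2^n."""
    total, prev = 0.0, 0.0
    for n in range(1, N + 1):
        s = 2.0 ** n
        total += (s - prev) / s ** p
        prev = s
    return total

diverging = partial_sum(1.0, 400)   # terms 1, 1/2, 1/2, ... : grows like N/2
converging = partial_sum(2.0, 400)  # terms 1/2, 2^-3, 2^-4, ... -> 3/4
```

The limit 3/4 for p=2 also sits comfortably below the bound 1/s_1^p + 3\sum_k 2^{-(p-1)k} = 1/4 + 6 from the proof.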


Miklos Schweitzer 2020, Problem 2.

Problem 2 (Miklos Schweitzer 2020). Prove that if f: \mathbb{R} \to \mathbb{R} is a continuous periodic function and \alpha \in \mathbb{R} is irrational, then the sequence \displaystyle \big\{n\alpha+f(n\alpha)\big\}_{n=1}^{\infty} modulo 1 is dense in [0,1].


Let T>0 be a period of f. Define f_1(x):=f(Tx), x\in \mathbb{R}. Then f_1 is periodic with period 1 and \displaystyle f(n\alpha)=f_1\Big(n\frac{\alpha}{T}\Big). Denote \beta:=\alpha/T. Thus, we have to prove that \displaystyle \big\{n\alpha+f_1(n\beta)\big\}_{n=1}^{\infty} is dense modulo 1, where f_1 is continuous and periodic with period 1. It is known that if 1,\alpha,\beta are linearly independent over \mathbb{Q}, then (\{n\alpha\},\{n\beta\}) is dense in [0,1]^2, where \{y\} hereafter denotes the fractional part of y. A proof of this fact, based on Weyl's criterion, is included in the last section. So, in this case, fix some x_0\in (0,1) and let a:=f_1(x_0). For any \varepsilon>0 we can choose a sequence 0<n_1<n_2<\dots for which f_1(n_i\beta)=f_1(\{n_i\beta\})\in (a-\varepsilon, a+\varepsilon) and \{n_i\alpha\}, i=1,2,\dots, is dense in [0,1]. It follows that \big\{n_i \alpha+f_1(n_i\beta)\big\}, i=1,2,\dots, comes within \varepsilon of every point of [0,1]; since \varepsilon>0 is arbitrary, the sequence is dense in [0,1].
The second case: suppose now that \alpha,\beta,1 are linearly dependent over \mathbb{Q}, hence

\displaystyle \beta =\frac{p_1}{q_1}\alpha+\frac{p_2}{q_2}

for some p_1,p_2\in \mathbb{Z}, p_1\ne 0, and q_1,q_2\in\mathbb{Z}_{>0}. We put n=q_1q_2m, m=0,1,2,\dots, and get

\{n\alpha+f_1(n\beta)\} = \{q_1q_2m\alpha+f_1(p_1q_2m\alpha +mp_2q_1)\}=

=\{q_1q_2m\alpha+f_1(p_1q_2m\alpha)\}

since f_1 is periodic with period 1. Denote f_2(x):=f_1(p_1q_2x). Then f_2 is also continuous and periodic with period 1. Note that \big\{q_1q_2x+f_2(x)\big \} is periodic with period 1. Hence

\displaystyle \{q_1q_2m\alpha +f_2(m\alpha)\}=\big\{q_1q_2\{m\alpha\} +f_2(\{m\alpha\})\big\}

It remains to see that g(x):=q_1q_2x+f_2(x) is a continuous function on [0,1] with g(1)-g(0)=q_1q_2\in \mathbb{Z}_{>0}, hence the image of g over x\in[0,1] contains an interval of length 1. Since \{m\alpha\}, m=1,2,\dots, is dense in [0,1], it follows that \big\{g(\{m\alpha\})\big\} is also dense in [0,1], and the result follows. \blacksquare

Dense sequence of vectors in the unit square.

Here is a proof of the fact used in the solution. Let \alpha, \beta\in \mathbb{R} be such that \alpha,\beta,1 are linearly independent over \mathbb{Q}, that is, there do not exist \ell_1,\ell_2\in \mathbb{Z}, (\ell_1,\ell_2)\neq (0,0), such that \ell_1\alpha+\ell_2\beta\in \mathbb{Z}. Then the sequence of vectors (n\alpha,n\beta), n=0,1,\dots, taken modulo 1, is dense in [0,1]^2. Moreover, this sequence is equidistributed in [0,1]^2. We use the following

Weyl criterion. The sequence of vectors v_n\in Q:= [0,1]^k,n=1,2,\dots is equidistributed if and only if

\displaystyle \lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1} e^{2\pi i \ell\cdot v_n}=0 \qquad (*)

for every nonzero vector \ell\in \mathbb{Z}^k.

In our case, take any nonzero \ell=(\ell_1,\ell_2)\in \mathbb{Z}^2. We have

\displaystyle \sum_{n=0}^{N-1} e^{2\pi i \ell\cdot v_n}=\sum_{n=0}^{N-1} e^{2\pi i (\ell_1n\alpha + \ell_2n\beta)}=

\displaystyle =\sum_{n=0}^{N-1} e^{2\pi i n\gamma}.

where we denoted \gamma:=\ell_1\alpha+\ell_2\beta. Note that \gamma is irrational: if \gamma=p/q, then q\ell_1\alpha+q\ell_2\beta\in\mathbb{Z} with (q\ell_1,q\ell_2)\neq(0,0), contradicting the linear independence. Further

\displaystyle \left|\sum_{n=0}^{N-1} e^{2\pi i n\gamma}\right|=\frac{\left|e^{2\pi i N\gamma}-1\right|}{\left|e^{2\pi i\gamma }-1\right|}\leq \frac{2}{\left|1-e^{2\pi i\gamma }\right|}.

Since \gamma is irrational, e^{2\pi i\gamma }\neq 1, so the RHS is a bound independent of N, and thus the limit in (*) is zero.
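This bound is easy to watch numerically; the sketch below (my own choice \gamma=\sqrt 2) prints the partial sums and their averages.

```python
import cmath
import math

# Numerical check: for irrational gamma the partial sums
# sum_{n<N} e^{2*pi*i*n*gamma} stay bounded by 2/|1 - e^{2*pi*i*gamma}|,
# so the averages appearing in (*) tend to zero.
gamma = math.sqrt(2)  # any irrational works here
bound = 2.0 / abs(1 - cmath.exp(2j * math.pi * gamma))

for N in (10, 100, 1000, 10000):
    s = sum(cmath.exp(2j * math.pi * n * gamma) for n in range(N))
    print(N, abs(s), abs(s) / N)   # |s| stays below `bound`; |s|/N shrinks
```

The partial sums never exceed the constant bound, while the averaged sums visibly decay like 1/N.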

CIIM 2017 Problem 2. Very Slowly Growing Function Has to be Zero.

Problem 2, CIIM 2017. Let f :\mathbb{R} \to \mathbb{R} be a differentiable function satisfying f(0) = 0 and |f'(x)| \leq |f(x)|\cdot \big|\log |f(x)|\big| for every x \in \mathbb{R} with 0 < |f(x)| < 1/2. Prove that f(x) = 0, \forall x \in \mathbb{R}.


Let’s first make some observations that lead us to the solution below. Take some very small \varepsilon := 2^{-k}. I prefer to represent \varepsilon this way because it makes the logarithm easier to handle. Think about when |f(x)| hits \varepsilon for the first time as x increases from 0 to +\infty; let this happen at x=x_1. Then |f(x)|<2^{-k}, \forall x\in \big[0,x_1\big), and since t|\log t| is increasing for t\in (0,1/4), we get |f'(x)|\le k2^{-k}, \forall x\in \big[0,x_1\big). Hence the interval \big[0,x_1\big] cannot be too short: a function with such a small derivative needs room to climb from 0 up to 2^{-k}. So x_1\ge 2^{-k}/\left(k2^{-k}\right), i.e. x_1\ge 1/k. In other words, for f to gain \Delta f=2^{-k} it needs a step \Delta x=1/k; the ratio \displaystyle \frac{\Delta x}{\Delta f} is enormous when k is sufficiently large, i.e. the growth rate is very slow. That’s why, doing this for k=N,N-1,N-2,\dots, we see that f can reach these successive levels only over an interval of length at least \frac{1}{N}+\frac{1}{N-1}+\dots. But this sum can be made as large as we want, so f cannot rise above any fixed level on any bounded interval.


Fix some positive integer N and let

\displaystyle x_{n} = \inf \{x>0 : |f(x)|\geq 2^{-N+n}\}, n=1,2,\dots,N

Set x_0:=0. Clearly 0=x_0<x_1<\dots <x_{N} and |f(x_n)|=2^{-N+n}, n=1,2,\dots,N, because f is continuous. Using the given condition we have

\displaystyle |f'(x)|\le \frac{N-n}{2^{N-n}}, \forall x\in [x_{n-1},x_n], n=1,2,\dots,N-2

since t|\log t| is increasing when t\in \big(0,1/4\big). Hence

By the mean value theorem,

\displaystyle x_n-x_{n-1}\ge \frac{|f(x_n)|-|f(x_{n-1})|}{(N-n)2^{-N+n}}\ge \frac{2^{-N+n}-2^{-N+n-1}}{(N-n)2^{-N+n}},

hence

\displaystyle x_{n}-x_{n-1}\ge \frac{1}{2(N-n)}\,,\, n=1,2,\dots,N-2\qquad (1)

Further, we fix some m\in\mathbb{N},m>2 and sum up (1) for n=1,2,\dots,N-m. It yields

\displaystyle x_{N-m}=x_{N-m}-x_0\ge\frac{1}{2}\sum_{i=m}^{N-1}\frac{1}{i}\qquad (2)

Recall that |f(x)|<2^{-m},\forall x\in\big [0, x_{N-m}\big). Letting N\to\infty we can make x_{N-m} as large as we want, since the RHS of (2) diverges as N\to\infty. Thus we get |f(x)|<2^{-m},\forall x\in \big[0,\infty\big). Since m is arbitrary, it follows that f(x)=0,\forall x\in \big[0,\infty\big). The case x<0 is similar. \blacksquare
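The divergence used at the end can be checked directly. In the sketch below, lower_bound(m, N) is just the RHS of (2); the doubling loop finds how large N must be for the bound to exceed a prescribed length.

```python
# The key quantitative fact behind (2): the lower bound
# x_{N-m} >= (1/2) * sum_{i=m}^{N-1} 1/i
# is a tail of the harmonic series, so it exceeds any target
# once N is taken large enough.
def lower_bound(m, N):
    return 0.5 * sum(1.0 / i for i in range(m, N))

m = 3
for target in (1.0, 2.0, 5.0):
    N = m + 1
    while lower_bound(m, N) < target:
        N *= 2                       # harmonic growth is ~ (1/2) log N
    print(target, N, lower_bound(m, N))
```

Since the bound grows like \tfrac12\log N, reaching a target length costs exponentially many levels, which matches the "very slow growth" heuristic above.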


IMC 2003, Problem 11. Part II.

In the previous post an interesting problem from IMC 2003 was considered. I glanced at the official solution (there are actually two of them), and the second one sheds more light on the true reason why the claim fails when \mathbb{R} is put in place of \mathbb{Q}. It's not because \mathbb{R} is a complete metric space whilst \mathbb{Q} is not, which was the fact exploited in my previous post. The true reason is that \mathbb{R} has too many elements: it has the cardinality of the continuum. One may ask: why read this post instead of the official solution? (The corresponding link is provided at the end of the post.) Because I didn't like the way the official solution was written. It's visionary and uses ideas one should know, but the motivation is missing. Let's recall the statement again.

Problem 5, IMC 2003, second day.
a) Show that for each function f:\mathbb{Q} \times \mathbb{Q} \rightarrow \mathbb{R}, there exists a function g:\mathbb{Q}\rightarrow \mathbb{R} with f(x,y) \leq g(x)+g(y) for all x,y\in \mathbb{Q}.
b) Find a function f:\mathbb{R} \times \mathbb{R} \rightarrow \mathbb{R}, for which there is no function g:\mathbb{R}\rightarrow \mathbb{R} such that f(x,y) \leq g(x)+g(y) for all x,y\in \mathbb{R}.

On part b). Some analysis first. We are searching for a counterexample f(x,y). Suppose some candidate g provides the required upper bound, and fix some x_0\in \mathbb{R}. It must hold

\displaystyle f(x_0,y)\le g(x_0)+g(y) \qquad(1)

for all y\in\mathbb{R}. Think about the growth rate of the RHS as y\to +\infty. It's actually the growth rate of the function g(y), because adding a constant doesn't affect much the way a function increases. For example, the function y+g(y) increases faster than c+g(y) no matter what the constant c is. So, to exploit this observation, we should define the function f(x,y) in such a way that for any function g(y) there exists x_0\in \mathbb{R} for which f(x_0,y) grows faster than g(y) as a function of y. Fortunately, this is possible because we have continuum many x_0\in\mathbb{R} at our disposal, enough to cover all possible growth rates.

To measure the growth rate of a function g(y) as y approaches +\infty, we can focus only on the positive integer values of y, i.e. on the sequence \displaystyle \big(g(n)\big)_{n=1}^{\infty}. So, for any such sequence there must exist a sequence \displaystyle \big(f(x_0, n)\big)_{n=1}^{\infty} with a faster growth rate.
One final observation. The set of all real sequences \big( a_n\big)_{n=1}^{\infty} is not too large: it has the cardinality of the continuum (see why at the end of the post). So, there exists a bijection, an enumeration of all real sequences, x\mapsto f_x such that, as x runs through \mathbb{R}, the sequence f_x=\big(f(x,n)\big)_{n=1}^{\infty} runs through all possible real sequences. This will be our counterexample. Indeed, for any sequence \big(g(n)\big)_{n=1}^{\infty} the sequence g_1(n):=n+g(n) grows faster than the sequence \big(c+g(n)\big)_{n=1}^{\infty} for any fixed constant c\in\mathbb{R}. But the sequence g_1(n) appears somewhere in our enumeration f, that is, there exists x_0\in \mathbb{R} with

\displaystyle \big( f(x_0,n)\big)_{n=1}^{\infty}\equiv \big(g_1(n)\big)_{n=1}^{\infty}

Now, for this x_0, (1) cannot hold for all y\in\mathbb{N}: taking y=n, it would give n+g(n)\le g(x_0)+g(n), i.e. n\le g(x_0) for all n\in\mathbb{N}, a contradiction.
It remains to define f(x,y) for y outside \mathbb{N} arbitrarily, and we are done. \blacksquare

Why the family of all real sequences has cardinality of the continuum.

Let \big( x_n\big)_{n=1}^{\infty} be a sequence of reals. We write the decimal representation of each x_n

\displaystyle x_n=\overline {a_{n0}\,.\,a_{n1}a_{n2}\dots} \,,\, n=1,2,\dots

where a_{n0} is an integer and a_{ni}\in\{0,1,\dots,9\},i\ge 1. Now, we could encode the sequence \big( x_n\big)_{n=1}^{\infty} as consecutive blocks

a_{j,0}, a_{j-1,1}, a_{j-2,2},\dots, a_{1,j-1}\,,\, j=1,2,\dots

lined up in a sequence. From this sequence we can recover the original one \big( x_n\big)_{n=1}^{\infty}; thus this encoding injectively maps the set of all real sequences into the set of all integer sequences, which is well known to have the cardinality of the continuum.
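A finite truncation of this diagonal encoding is easy to program. In the sketch below (the table size and the digit formula are arbitrary choices of mine), table[n][i] plays the role of the digit a_{n,i}, and the flattening walks the anti-diagonals n+i=j exactly as above.

```python
# Diagonal encoding of a 2-D digit table into a single sequence,
# and its inverse: the round trip shows the map is injective.
def encode(table, J):
    """Flatten digits a[n][i] along anti-diagonals n + i = j, j = 1..J."""
    out = []
    for j in range(1, J + 1):
        for n in range(j, 0, -1):        # (j,0), (j-1,1), ..., (1,j-1)
            out.append(table[n][j - n])
    return out

def decode(flat, J):
    """Rebuild the digit table from the flattened sequence."""
    table, pos = {}, 0
    for j in range(1, J + 1):
        for n in range(j, 0, -1):
            table.setdefault(n, {})[j - n] = flat[pos]
            pos += 1
    return table

# A tiny digit table: table[n][i] is the i-th digit of the n-th number.
table = {n: {i: (3 * n + 7 * i) % 10 for i in range(6)} for n in range(1, 7)}
flat = encode(table, 6)
restored = decode(flat, 6)
print(all(restored[n][j - n] == table[n][j - n]
          for j in range(1, 7) for n in range(j, 0, -1)))  # -> True
```

The flattened list has \sum_{j=1}^{J} j entries after J diagonals, and every digit of every number is eventually reached, which is the whole point of walking diagonals rather than rows.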

[1] The AoPS thread
[2] The IMC official site, 2003