Tuesday, October 24, 2017

Commentary stream for TopCoder Open 2017 Finals

TopCoder Open 2017 final round is today at 19:30 CEST. Tune in at https://youtu.be/qrlDoP3JQkI to hear me, pashka and Endagorion discuss! This time we should have problem statements at the beginning of the round :)

Monday, October 23, 2017

Commentary stream for TopCoder Open 2017 Semifinal 2 with Egor

We'll try to do live commentary of TopCoder Open 2017 Semifinal 2 with Egor on https://youtu.be/AqpjM86iZlc  - tune in at 19:30 CEST!

Sunday, October 8, 2017

A randomized week

The Aug 14 - Aug 20 week contained the usual suspects: a TopCoder SRM and a Codeforces round. First off, TopCoder SRM 719 took place in the early Tuesday hours (problems, results, top 5 on the left, analysis). yanQval was a lot faster than everybody else and won the round - well done!

Codeforces Round 429 followed on Friday (problems, results, top 5 on the left, analysis). anta and LHiC have both solved their final fifth problem with only a few minutes to spare and thus obtained quite a margin at the top of the scoreboard - way to go!

Problem D in this round allowed for a cute randomized solution that in my view is a bit easier than the one from the official analysis. The problem was: you are given an array of 300000 numbers, and 300000 queries of the following form: a segment [l;r] and a number k, for which you need to find the smallest number that appears on at least 1/k of all positions in the given segment of the array. Note that k is at most 5, so the number must be quite frequent within the segment. Can you see how to use that fact to implement a simple randomized solution?

And finally, I've checked out CodeChef August 2017 Cook-Off on Sunday (problems, results, top 5 on the left, my screencast). EgorK was at the top of the standings with 4 problems for quite some time, but gennady.korotkevich has then submitted his fourth and fifth problems in rapid succession and won the competition. Congratulations!

In my previous summary, I have mentioned a problem that I have set for Google Code Jam 2017 finals: you are given a number k between 3 and 10000. Output any simple undirected graph with at most 22 vertices that has exactly k spanning trees.

The approach that works the most frequently in such problems is to learn several transitions that help cover all required numbers. For example, if we could come up with a way to transition from a graph with k spanning trees to graphs with 2k and 2k+1 spanning trees by adding one vertex, the problem would be solved. Alas, this does not seem possible, as it's not clear how to achieve that extra "+1" spanning tree.

As explicit construction turns out to be hard to impossible, we have to turn to more experimental approaches. One thing that stands out is that there's an enormous amount of graphs on 22 vertices, and while they have at most 2220 spanning trees, many have under 10000 spanning trees, and thus we can hope that for each k in the given range there are many possible answers. So the next idea is to try generating random graphs and checking how many spanning trees those graphs have using the matrix tree theorem, hoping to find all numbers between 3 and 10000 at least once.

It turns out that some amounts of trees are more difficult to achieve than others, so this approach by itself is not sufficient to generate all answers within reasonable time. However, there are numerous different ideas that help it find all required numbers faster, and thus help get the problem accepted. I won't be surprised if everybody who got this problem accepted at the Code Jam has used a different approach :) My experimentation led me down the following path:

  • Since some numbers are hard to get, we try to steer the random graphs towards those numbers. Suppose we have a "goal" number we're trying to achieve. The easiest way to do that is to avoid recreating the graph from scratch, but instead take the graph from the previous attempt, and check if it has less trees than the goal, or more. In the former case, we add a random edge to it, and otherwise remove one. As a result, we're more likely to hit the goal, and as soon as we do it, we pick another goal from the yet unseen numbers and repeat the process.
  • This process has an unfortunate bad state: when we remove an edge, we can make the graph disconnected, thus having 0 spanning trees. At this point we start adding random edges, but if the graph stays disconnected, it keeps having 0 spanning trees. Finally, we add an edge that connects the graph back, but since we have added so many other edges in the meantime, suddenly the graph has a lot of spanning trees, way more than we need. So we need to remove a lot of edges to get back to reasonable numbers, and may well jump back to 0 in the process. Having noticed this happening, I've modified the process to never remove an edge that would make the graph disconnected.
  • Finally, we get almost all numbers we need! After a minute or so, my solution has generated all numbers between 3 and 10000 except 13 and 22 (if I remember correctly). The final idea is that small numbers can still be generated by an explicit construction: a cycle with k vertices has k spanning trees. Since we're allowed at most 22 vertices, 13 and 22 can be done using a cycle.
That's all for my solution, but I have one more question for you: why is 22 so hard to get? In particular, does there exist a simple undirected graph with at most 21 vertices that has 22 spanning trees? I strongly suspect that the answer is no, but don't know for sure.

Thanks for reading, and check back later!

Wednesday, September 27, 2017

A week of 22

The Aug 7 - Aug 13 week was centered around Google Code Jam final rounds. First off, the 21 finalists competed in the Distributed Code Jam final round on Thursday (problems, results, top 5 on the left, analysis). ecnerwala has snatched the first place with just 3 minutes left in the contest on his 6th attempt on D-small - here's for perseverance :) Congratulations!

One day later, the traditional Google Code Jam final round saw its 26 finalists compete for the first place (problems, results, top 5 on the left, analysis, commentary stream). Gennady.Korotkevich has won the Code Jam for the fourth year in a row - well done! He has given everybody else a chance this time by solving only two large inputs, but his competitors could not catch up for various reasons. SnapDragon was leading going into the system test phase, but he knew himself that his solution for D-large was incorrect as it was not precise enough. Second-placed zemen could have claimed the victory had he solved C instead of E-small or F-small or both in the last hour.

I have set the said problem C for this round, which went like this: you are given a number k between 3 and 10000. Output any simple undirected graph with at most 22 vertices that has exactly k spanning trees. I don't think it is solvable just in one's head, but if you have some time to experiment on a computer, I think it's an interesting problem to try!

In my previous summary, I have mentioned a TopCoder problem: given a tree with n<=250 vertices, you need solve a system of at most 250 equations on this tree with m<=250 variables x1, x2, ..., xm , where each variable must correspond to some vertex in the tree. Several variables can correspond to the same vertex. Each equation has the following form: in the path from vertex xai to vertex xbi the vertex closest to vertex pi must be the vertex qi.

If we look at the tree as rooted at vertex pi, the corresponding equation means that both xai and xbi must be in the subtree of the vertex qi, and there must be no child of qi such that both xai and xbi are in the subtree of this child. At this point one had to strongly suspect this problem will reduce to a 2-SAT instance with boolean variables corresponding to "this variable is in this subtree". The equations are already in 2-SAT form on these boolean variables as described above, but the tricky part is to come up with additional clauses that guarantee that the values of boolean variables are consistent and we can pick one vertex for each variable.

First difficulty is that the tree is rooted differently for each equation. However, we can notice that each subtree of any re-rooting of the given tree is either a subtree of this tree when rooted at the first vertex, or a complement of such subtree. So if we have boolean variables for "xi is in the subtree of vertex y when rooted at vertex 1", then the complementary subtrees can be expressed simply as negations of those variables, as each xi must go to some vertex.

Second difficulty is ensuring that when we look at all subtrees which contain a given xi according to our boolean variables, they have exactly one vertex in their intersection that we can assign xi to. It turns out that this can be achieved using simple clauses "if xi is in the subtree of a vertex, it's also in the subtree of its parent, and not in subtrees of its siblings", which are handily in 2-SAT form. We can then find the deepest vertex containing xi in its subtree according to the values of the boolean variables, and it's not hard to see that all clauses will be consistent when xi is placed in that vertex.

Thanks for reading, and check back later for more competitive programming news!

Sunday, September 24, 2017

A 2-SAT week

The contests of July 31 - Aug 6 week started with the second competition day of IOI 2017 on Tuesday (problems, overall results, top 5 on the left). As expected, only the three highest scorers of day 1 had a real chance for the first place, and Yuta Takaya has grabbed this chance with the amazing score of 300 out of 300. Well done!

On Saturday, my fruitless attempts to qualify for the TopCoder Open onsite in Buffalo started with TopCoder Open 2017 Round 3A (problems, results, top 5 on the left, analysis). tourist has claimed the first place thanks to solving both the easy and the hard problem - congratulations - but qualification was also possible with easy+challenges, as Scott and Igor have demonstrated.

I could not figure out the solution for the hard problem in time, even though I suspected it would involve a reduction to 2-SAT from the very beginning. Can you see the way?

You were given a tree with n<=250 vertices and need solve a system of at most 250 equations on this tree with m<=250 variables x1, x2, ..., xm , where each variable must correspond to some vertex in the tree. Several variables can correspond to the same vertex. Each equation has the following form: in the path from vertex xai to vertex xbi the vertex closest to vertex pi must be the vertex qi.

Somewhat orthogonally to this particular problem, I felt like my general skill of reduction to 2-SAT could use some improvement. It seems that there are some common tricks that allow to replace complex conditions with just "or"s of two variables. For example, suppose we want a constraint "at most one of this subset of variables is true". It could naively be translated into n2 2-SAT clauses. However, we can introduce auxiliary variables to get by with only O(n) clauses: "bi must be true when at least one of the first i original variables ai is true" with clauses bi->bi+1 and ai->bi. Disallowing two true variables is then done via the clause bi->!ai+1.

What are the other cool 2-SAT reduction tricks?

Thanks for reading, and check back as I continue to remember August and September :)

Sunday, July 30, 2017

A week with two trees

This week's contests were clustered on Sunday. In the morning, the first day of IOI 2017 took place in Tehran (problems, results, top 5 on the left, video commentary). The "classical format" problems wiring and train were pretty tough, so most contestants invested their time into the marathon-style problem nowruz. Three contestants have managed to gain a decent score there together with 200 on the other problems, so it's quite likely that they will battle for the crown between themselves on day 2. Congratulations Attila, Yuta and Mingkuan!

A few hours later, Codeforces Round 426 gave a chance to people older than 20 :) (problems, results, top 5 on the left). The round contained a bunch of quite tedious implementation problems, and Radewoosh and LHiC have demonstrated that they are not afraid of tedious implementation by solving 4 out of 5. Well done!

In my previous summary, I have raised a sportsmanship question, asking whether it's OK to bail out of an AtCoder contest without submitting anything. The poll results came back as 42% OK, 28% not OK, and 30% saying that the contest format should be improved. As another outcome of that post, tourist and moejy0viiiiiv have shared their late submission strategies that do not consider bailing out as a benefit, so I was kind of asking the wrong question :) I encourage you to read the linked posts to see how the AtCoder format actually makes the competition more enjoyable.

I have also mentioned a bunch of problems in that post, so let me come back to one of them: you are given two rooted trees on the same set of 105 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible.

First of all, we can notice that the value assigned to each vertex can be computed as (the sum of its subtree) - (the sum of subtrees of all its children). Since all said sums are either -1 or 1, the parity of this value depends solely on the parity of the number of children. So we can compare said parities in two trees, and if there's a mismatch, then there's definitely no solution.

Now, after playing a bit with a few examples, we can start to believe that in case all parities match, there is always a solution. During the actual contest I've implemented a brute force to make sure this is true. Moreover, we can make an even stronger hypothesis: there is always a solution where all values are -1,0 or 1. Again, my brute force has confirmed that this is most likely true.

For vertices where the value must be even, we can assume it's 0. For vertices where the value must be odd, we have two choices: -1 or 1. Let's call the latter odd vertices.

Our parity check guarantees that each subtree of each tree will have an odd number of odd vertices, in order to be able to get -1 or 1 in total. So we must either have k -1's and k+1 1's, or k+1 -1's and k 1's. Here comes the main idea: in order to achieve this, it suffices to split all odd vertices into pairs in such a way that in each subtree all odd vertices but one form k whole pairs, and each pair is guaranteed to have one 1 and one -1.

Having said that idea aloud, we can very quickly find a way to form the pairs: we would just go from leaves to the root of the tree, and each subtree will pass at most one unpaired odd vertex to its parent. There we would pair up as many odd vertices as possible, and send at most one further up.

Since we have not one, but two trees, we will get two sets of pairs. Can we still find an assignment of 1's and -1's that satisfies both sets? It turns out we can: the necessary and sufficient condition for such assignment to exist is for the graph formed by the pairs to be bipartite. And sure enough, since the graph has two types of edges, and each vertex has at most one edge of each type, any cycle must necessarily have the edge types alternate, and thus have even length. And when all cycles have even length, the graph is bipartite.

Looking back at the whole solution, we can see that the main challenge is to restrict one's solution in the right way. When we work with arbitrary numbers, there are just too many dimensions in the problem. When we restrict the numbers to just -1, 0, and 1, the problem becomes more tractable, and it becomes easier to see which ideas work and which don't. And when we add the extra restriction in form of achieving the necessary conditions by pairing up the vertices, the solution flows very naturally.

Thanks for reading, and check back next week!

Sunday, July 23, 2017

An unsportsmanlike week

Yandex.Algorithm 2017 Final Round took place both in Moscow and online on Tuesday (problems, results, top 5 on the left). My contest started like a nightmare: I was unable to solve any of the given problems during the first hour, modulo a quick incorrect heuristic submission on problem F. As more and more people submitted problems D (turned out to be a straightforward dynamic programming) and E (turned out to be about finding n-th Catalan number), I grew increasingly frustrated, but still couldn't solve them. After my wits finally came back to me after an hour, all problems suddenly seemed easy, and I did my best to catch up with the leaders, submitting 5 problems. Unfortunately, one of them turned out to have a very small bug, and I was denied that improbable victory :)

Congratulations to tourist, W4yneb0t and rng.58 on winning the prizes!

Here's the Catalan number problem that got me stumped. Two people are dividing a heap of n stones. They take stones in turns until none are left. At their turn, the first person can take any non-zero amount of stones. The second person can then also take any non-zero amount of stones, but with an additional restriction that the total amount of stones he has after this turn must not exceed the total amount of stones the first person has. How many ways are there for this process to go to completion, if we also want the total amount of stones of each person to be equal in the end?

TopCoder Open 2017 Round 2C was the first event of the weekend (problems, results, top 5 on the left, parallel round resultsmy screencast). The final 40 contestants for the Round 3 were chosen, and kraskevich advanced with the most confidence of all, as he was the only contestant to solve all problems. Congratulations!

This round contained a very cute easy problem. Two teams with n players each are playing a game. Each player has an integer strength. The game lasts n rounds. In the first round, the first player of the first team plays the first player of the second team, and the stronger player wins. In the second round, the first two players of the first team play the first two players of the second team, and the pair with the higher total strength wins. In the i-th round the strengths of the first i players in each team are added up to determine the winner. You know the strengths of each player, and the ordering of the players in the second team. You need to pick the ordering for the players in the first team in such a way that the first team wins exactly k rounds. Do you see the idea?

And finally, AtCoder Grand Contest 018 took place on Sunday (problems, results, top 5 on the left, analysis, my screencast with commentary). tourist has adapted his strategy to the occasion, submitting the first four problems before starting work on the last two, but in the end that wasn't even necessary as he also solved the hardest problem and won with a 15 minute margin. Well done!

As you can see in the screencast, I was also trying out this late submission strategy this time, and when I solved the first four and saw that Gennady has already submitted them, I was quite surprised. There was more than an hour left, so surely I'd be able to solve one more problem? And off I went, trying to crack the hardest problem because it gave more points and seemed much more interesting to solve than the previous one.

I have made quite a bit of progress, correctly simplifying the problem to the moment where the main idea mentioned in the analysis can be applied, but unfortunately could not come up with that idea. That left me increasingly anxious as the time ran out: should I still submit the four problems I have (which turned out to be all correct in upsolving), earning something like 40th place instead of 10th or so that I would've got had I submitted them right after solving? Or should I avoid submitting anything, thus not appearing in the scoreboard at all and not losing rating, but showing some possibly unsportsmanlike behavior?

I have to tell you, this is not a good choice to have. Now I admire people who can pull this strategy off without using the escape hatch even more :) To remind, the benefits of this strategy, as I see them (from comments in a previous post), are:
1) Not giving information to other competitors on the difficulty of the problems.
2) Not allowing other competitors to make easy risk/reward tradeoffs, as if they know the true scoreboard, they might submit their solutions with less testing when appropriate.

I ended up using the escape hatch, which left me feeling a bit guilty, but probably more uncertain than guilty. Do you think this is against the spirit of competition, as PavelKunyavskiy suggests? Please share your opinion in comments, and also let's have a vote:

Is it OK to bail out of a contest after a poor performance with late submit strategy at AtCoder?

Here's the hardest problem that led to all this confusion: You are given two rooted trees on the same set of 105 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible. Can you do that?

In my previous summary, I have mentioned a hard VK Cup problem. You are given an undirected graph with 105 vertices and edges. You need to assign non-negative integer numbers not exceeding 106 to the vertices of the graph. The weight of each edge is then defined as the product of the numbers at its ends, while the weight of each vertex is equal to the square of its number. You need to find an assignment such that the total weight of all edges is greater than or equal to the total weight of all vertices, and at least one number is nonzero.

The first idea is: as soon as the graph has any cycle, we can assign number 1 to all vertices in the cycle, and 0 to all other vertices, and we'll get what we want. So now we can assume the graph is a forest, and even a tree.

Now, consider a leaf in that forest, and assume we know the number x assigned to the only vertex this leaf is connected to. If we now assign number y to this leaf, then the difference between the edge weight and the vertex weight will increase by x*y-y*y. We need to pick y to maximize this difference, which is finding the maximum of a quadratic function, which is achieved when y=x/2, and the difference increases by x2/4.

Now suppose we have a vertex that is connected to several leaves, and to one other non-leaf vertex for which we know the assigned number x. After we determine the number y for this vertex, all adjacent leaves must be assigned y/2, so we can again compute the difference as the function of y and find its maximum. It will be a quadratic function again, and the solution will look like x*const again.

Having rooted the tree in some way, this approach allows us to actually determine the optimal number for all vertices going from leaves to the root as multiples of the root number, and we can check if the overall delta is non-negative or not.

There's an additional complication caused by the fact that the resulting weights must be not very big integers. We can do the above computation in rational numbers to make all numbers integer, but they might become quite big. However, if we look at the weight differences that some small trees get, we can notice that almost all trees can get a non-negative difference, and come up with a case study that finds a subtree in a given tree which still has a non-negative difference but can be solved with small numbers. I will leave this last part as an exercise for the readers.

Wow, it feels good to have caught up with the times :) Thanks for reading, and check back next week!